FIELD OF THE INVENTION

The present invention relates to the determination of the function of
novel gene products. The invention further relates to Protein-fragment
Complementation Assays (PCAs). PCAs allow for the detection of a wide
variety of protein-protein, protein-RNA, protein-DNA,
protein-carbohydrate or protein-small organic molecule interactions in
different cellular contexts appropriate to the study of such interactions.

BACKGROUND OF THE INVENTION



Many processes in biology, including transcription, translation, and
metabolic or signal transduction pathways, are mediated by
non-covalently associated multienzyme complexes. The formation of
multiprotein or protein-nucleic acid complexes produces the most
efficient chemical machinery. Much of modern biological research is
concerned with identifying proteins involved in cellular processes and
determining their functions and how, when, and where they interact with
other proteins involved in specific pathways. Further, with rapid
advances in genome sequencing projects, there is a need to develop
strategies to define "protein linkage maps", detailed inventories of
protein interactions that make up functional assemblies of proteins.
Despite the importance of understanding protein assembly in biological
processes, there are few convenient methods for studying protein-protein
interactions in vivo. Approaches include the use of chemical
crosslinking reagents and resonance energy transfer between dye-coupled
proteins. A powerful and commonly used strategy, the yeast two-hybrid
system, is used to identify novel protein-protein interactions and to
examine the amino acid determinants of specific protein interactions.
The approach allows for rapid screening of a large number of clones,
including cDNA libraries. Limitations of this technique include the fact
that the interaction must occur in a specific context (the nucleus of
S. cerevisiae) and that it generally cannot be used to distinguish
induced versus constitutive interactions.

Recently, a novel strategy for detecting protein-protein interactions,
called the ubiquitin-based split protein sensor (USPS), has been
demonstrated by Johnsson and Varshavsky. The strategy is based on
cleavage of proteins with N-terminal fusions to ubiquitin by cytosolic
proteases (ubiquitinases) that recognize its tertiary structure. The
strategy depends on the reassembly of the tertiary structure of the
protein ubiquitin from complementary N- and C-terminal fragments and,
crucially, on the augmentation of this reassembly by oligomerization
domains fused to these fragments. Reassembly is detected as specific
proteolysis of the assembled product by cytosolic proteases
(ubiquitinases). The authors demonstrated that a fusion of a reporter
protein to a ubiquitin C-terminal fragment could also be cleaved by
ubiquitinases, but only if co-expressed with an N-terminal fragment of
ubiquitin that was complementary to the C-terminal fragment. The
reconstitution of observable ubiquitinase activity only occurred if the
N- and C-terminal fragments were bound through GCN4 leucine zippers.
The authors suggested that this "split-gene" strategy could be used as
an in vivo assay of protein-protein interactions and for analysis of
protein assembly kinetics in cells. Unfortunately, this strategy
requires additional cellular factors (in this case ubiquitinases), and
the detection method does not lend itself to high-throughput screening
of cDNA libraries.

Rossi, F., C. A. Charlton, and H. M. Blau (Proc. Natl. Acad. Sci. USA
(1997) 94, 8405-8410) have reported an assay based on the classical
complementation of α and ω fragments of β-galactosidase (β-gal) and
induction of complementation by rapamycin-induced oligomerization of
the proteins FKBP12 and the mammalian target of rapamycin in
transfected C2C12 myoblast cell lines. Reconstitution of β-gal activity
is detected with the substrate fluorescein di-β-D-galactopyranoside
using several fluorescence detection assays. While this assay bears
some resemblance to the present invention, there are several
significant distinguishing differences. First, this particular
complementation approach has been used for over thirty years in a vast
number of applications, including the detection of protein-protein
interactions. Krevolin, M. and D. Kates (U.S. Patent No. 5,362,625
(1993)) teach the use of this complementation to detect protein-protein
interactions. Also, achievement of β-gal complementation in mammalian
cells has previously been reported (Moosmann, P. and S. Rusconi (1996)
Nucl. Acids Res. 24, 1171-1172). The individual PCAs presented here are
completely de novo designed interaction detection assays, not described
in any way previously except in publications arising from applicants'
laboratory. Secondly, this application describes a general strategy to
develop molecular interaction assays from a large number of enzyme or
protein detectors, all de novo designed assays, whereas the β-gal assay
is not novel, nor are any general strategies or advancements over
previously well documented applications given.

As in the USPS, the yeast two-hybrid strategy requires additional
cellular machinery for detection that exists only in specific cellular
compartments. There is therefore a need for a detection system which
uses the reconstitution of a specific enzyme activity from fragments as
the assay itself, without the requirement for other proteins for the
detection of the activity. Preferably, the assay would involve an
oligomerization-assisted complementation of fragments of monomeric or
multimeric enzymes that require no other proteins for the detection of
their activity. Furthermore, if the structure of an enzyme were known,
it would be possible to design fragments of the enzyme to ensure that
the reassembled fragments would be active and to introduce mutations to
alter the stringency of detection of reassembly. However, knowledge of
structure is not a prerequisite to the design of complementing
fragments, as will be explained below. The flexibility allowed in the
design of such an approach would make it applicable to situations where
other detection systems may not be suitable.

Recent advances in human genomics research have led to rapid progress
in the identification of novel genes. In applications to biological and
pharmaceutical research, there is now a pressing need to determine the
functions of novel gene products; for example, for genes shown to be
involved in disease phenotypes. It is in addressing questions of
function that genomics-based pharmaceutical research becomes bogged
down, and there is now a need for advances in the development of simple
and automatable functional assays. A first step in defining the
function of a novel gene is to determine its interactions with other
gene products in an appropriate context; that is, since proteins make
specific interactions with other proteins or other biopolymers as part
of functional assemblies, an appropriate way to examine the function of
a novel gene is to determine its physical relationships with the
products of other genes.

Screening techniques for protein interactions, such as the yeast
"two-hybrid" system, have transformed molecular biology, but can only
be used to study specific types of constitutively interacting proteins
or interactions of proteins with other molecules, in narrowly defined
cellular and compartmental contexts, and require a complex cellular
machinery (transcription) to work. To rationally screen for protein
interactions within the context of a specific problem requires more
flexible approaches. Specifically, assays are needed that meet the
criteria necessary not only for detecting molecular interactions but
also for validating these interactions as specific and biologically
relevant.

A list of assay characteristics that meet such criteria is as follows:

1) Allow for the detection of protein-protein, protein-DNA/RNA or
protein-drug interactions in vivo or in vitro.

2) Allow for the detection of these interactions in appropriate
contexts, such as within a specific organism, cell type, cellular
compartment, or organelle.

3) Allow for the detection of induced versus constitutive
protein-protein interactions (such as by a cell growth or inhibitory
factor).

4) Allow specific versus non-specific protein-protein interactions to
be distinguished by controlling the sensitivity of the assay.

5) Allow for the detection of the kinetics of protein assembly in cells.

6) Allow for screening of cDNA, small organic molecule, or DNA or RNA
libraries for molecular interactions.

SUMMARY OF THE INVENTION

The present invention seeks to meet the above-mentioned needs, on which
the prior art is silent. The present invention provides a general
strategy for detecting protein interactions with other biopolymers,
including other proteins, nucleic acids and carbohydrates, or for
screening small molecule libraries for compounds of potential
therapeutic value. In a preferred embodiment, the instant invention
seeks to provide an oligomerization-assisted complementation of
fragments of monomeric enzymes that require no other proteins for the
detection of their activity. In one such embodiment, a protein-fragment
complementation assay (PCA) based on reconstitution of dihydrofolate
reductase activity by complementation of defined fragments of the
enzyme in E. coli is hereby provided. This assay requires no additional
endogenous factors for detecting specific protein-protein interactions
(e.g. leucine zipper interactions) and can be conveniently extended to
screening cDNA, nucleic acid, small molecule or protein design
libraries for molecular interactions. In addition, the assay can also
be adapted for detection of protein interactions in any cellular
context or compartment and be used to distinguish between induced
versus constitutive protein interactions in both prokaryotic and
eukaryotic systems.

One particular strategy for designing a protein complementation assay
(PCA) is based on selecting: 1) a protein or enzyme that is relatively
small and monomeric, 2) for which there is a large literature of
structural and functional information, 3) for which simple assays exist
for the reconstitution of the protein or activity of the enzyme, both
in vivo and in vitro, and 4) for which overexpression in eukaryotic and
prokaryotic cells has been demonstrated. If these criteria are met, the
structure of the enzyme is used to decide the best position in the
polypeptide chain to split the gene in two, based on the following
criteria: 1) the fragments should result in subdomains of continuous
polypeptide; that is, the resulting fragments will not disrupt the
subdomain structure of the protein, 2) the catalytic and cofactor
binding sites should all be contained in one fragment, and 3) the
resulting new N- and C-termini should be on the same face of the
protein to avoid the need for long peptide linkers and allow for
studies of orientation-dependence of protein binding.
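
The first two splitting criteria above can be illustrated with a short
sketch. This is not taken from the application: the `EnzymeModel` record,
its fields and the toy residue numbers are hypothetical, and criterion 3
(new termini on the same face) is omitted because it would need
three-dimensional coordinates.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class EnzymeModel:
    """Hypothetical summary of what is known about a candidate enzyme."""
    length: int                        # number of residues
    subdomains: List[Tuple[int, int]]  # inclusive residue ranges
    functional_sites: Set[int]         # catalytic and cofactor-binding residues


def cut_is_acceptable(enzyme: EnzymeModel, cut: int) -> bool:
    """True if splitting after residue `cut` meets criteria 1 and 2."""
    if not 1 <= cut < enzyme.length:
        return False
    # Criterion 1: the cut must not fall inside a subdomain.
    for start, end in enzyme.subdomains:
        if start <= cut < end:
            return False
    # Criterion 2: all functional residues must sit in a single fragment.
    in_n_fragment = {r for r in enzyme.functional_sites if r <= cut}
    return len(in_n_fragment) in (0, len(enzyme.functional_sites))


def candidate_cuts(enzyme: EnzymeModel) -> List[int]:
    """List every cut position that passes both checks."""
    return [c for c in range(1, enzyme.length) if cut_is_acceptable(enzyme, c)]


# Toy example with made-up numbers: two subdomains, sites in the first one.
toy = EnzymeModel(length=20, subdomains=[(1, 8), (9, 20)], functional_sites={3, 5})
print(candidate_cuts(toy))
```

In the toy case the only acceptable cut lies at the boundary between the
two subdomains; moving one functional residue into the second subdomain
leaves no acceptable cut under these two criteria.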

It should be understood that the above-mentioned criteria do not all
need to be satisfied for a proper working of the present invention. It
is an advantage that the enzyme be small, preferably between 10 and 40
kDa. Although monomeric enzymes are preferred, multimeric enzymes can
also be envisaged as within the scope of the present invention. The
dimeric protein tyrosinase can be used in the instant assay.
Information on the structure of the enzyme provides an additional
advantage in designing the PCA, but is not necessary. Indeed, an
additional strategy to develop PCAs is presented, based on a
combination of exonuclease digestion-generated protein fragments
followed by directed protein evolution, in application to the enzyme
aminoglycoside kinase. Although overexpression in prokaryotic cells is
preferred, it is not a necessity. It will be understood by the skilled
artisan that the catalytic site of the chosen enzyme does not
absolutely need to be on the same molecule.

The present application explains the rationale and criteria for using a
particular enzyme in a PCA. Figure 1 shows a general description of a
PCA. The gene for a protein or enzyme is rationally dissected into two
or more fragments. Using molecular biology techniques, the chosen
fragments are subcloned, and to the 5' ends of each, proteins that
either are known or thought to interact are fused. Co-transfection or
transformation of these DNA constructs into cells is then carried out.
Reassembly of the probe protein or enzyme from its fragments is
catalyzed by the binding of the test proteins to each other, and
reconstitution is observed with some assay. It is crucial to understand
that these assays will only work if the fused, interacting proteins
catalyze the reassembly of the enzyme. That is, observation of
reconstituted enzyme activity must be a measure of the interaction of
the fused proteins.
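
The logic of the assay described above, in which reporter activity is
observed only when complementary fragments are present and the proteins
fused to them interact, can be written down as a small truth-table
sketch. The `Fusion` record and the `reporter_reconstituted` function are
hypothetical names used only for illustration; the construct labels
follow those used for the DHFR embodiment (Z-F[1,2], Z-F[3],
Control-F[1,2]), and treating the control construct as carrying no test
protein is an assumption.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Fusion:
    """Hypothetical description of one fusion construct."""
    test_protein: Optional[str]  # protein fused to the fragment, None if absent
    fragment: str                # reporter fragment label, e.g. "F[1,2]"


def reporter_reconstituted(
    a: Fusion,
    b: Fusion,
    interacting_pairs: Set[FrozenSet],
    complementary_fragments: FrozenSet[str],
) -> bool:
    """Idealized readout: signal requires both conditions at once."""
    # Condition 1: the two constructs carry the two complementary fragments.
    fragments_ok = frozenset((a.fragment, b.fragment)) == complementary_fragments
    # Condition 2: the fused test proteins bind each other
    # (a homodimer is stored as a one-element frozenset).
    proteins_bind = frozenset((a.test_protein, b.test_protein)) in interacting_pairs
    return fragments_ok and proteins_bind


pairs = {frozenset({"Zipper"})}              # GCN4 zipper homodimerizes
dhfr = frozenset({"F[1,2]", "F[3]"})
z_f12 = Fusion("Zipper", "F[1,2]")
z_f3 = Fusion("Zipper", "F[3]")
control_f12 = Fusion(None, "F[1,2]")         # assumed: fragment without zipper

print(reporter_reconstituted(z_f12, z_f3, pairs, dhfr))        # both conditions met
print(reporter_reconstituted(control_f12, z_f3, pairs, dhfr))  # no interaction
```

The sketch is deliberately binary; in a real assay the readout is graded
and depends on expression levels and fragment stability.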

A preferred embodiment of the present invention focuses on a PCA based
on the enzyme dihydrofolate reductase. Expansion of the strategy to
include assays in eukaryotic cells, library screening, and a specific
application to problems concerning the study of integrated biochemical
pathways such as signal transduction pathways is presented. Additional
assays, including those based on enzymes that can act as dominant or
recessive drug selection or metabolic salvage pathways, are disclosed.
In addition, PCAs based on enzymes that will produce a colored or
fluorescent product are also disclosed. The present invention teaches
how the PCA strategy can be both generalized and automated for
functional testing of novel genes, screening of natural products or
compound libraries for pharmacological activity and identification of
novel gene products that interact with DNA, RNA or



WO 98/34120    PCT/CA98/00068



carbohydrates. It also teaches how the PCA strategy can be applied to
identifying natural products or small molecules from compound libraries
of potential therapeutic value that can inhibit or activate such
molecular interactions, and how enzyme substrates and small molecule
inhibitors of enzymes can be identified. Finally, it teaches how the
PCA strategy can be used to perform protein engineering experiments
that could lead to designed enzymes with industrial applications or
peptides with biological activity.

Simple strategies to design and implement assays for detecting protein
interactions in vivo are disclosed herein. We have designed
complementary fragments of the native mDHFR that, when coexpressed in
E. coli grown in minimal medium, allow for survival of clones
expressing the two fragments, where the basal activity of the
endogenous bacterial DHFR is inhibited by the competitive inhibitor
trimethoprim (Fig. 3). Reconstitution of activity only occurred when
both N- and C-terminal fragments of DHFR were coexpressed as C-terminal
fusions to GCN4 leucine zipper sequences, indicating that reassembly of
the fragments requires formation of a leucine zipper between the N- and
C-terminal fusion peptides. The sequential increase in cell doubling
times resulting from the destabilizing mutations directed at the
assembly interface (Ile114 to Val, Ala or Gly) demonstrates that the
observed cell survival under selective conditions is a result of the
specific, leucine-zipper-assisted association of mDHFR fragment[1,2]
with fragment[3], as opposed to nonspecific interactions of Z-F[3] with
Z-F[1,2]. Several detailed examples and many additional examples are
given.

As demonstrated previously with the ubiquitin-based split protein
sensor (USPS), a protein-fragment complementation strategy can be used
to study equilibrium and kinetic aspects of protein-protein
interactions in vivo. The DHFR and other PCAs, however, are simpler
assays. They are complete systems; no additional endogenous factors are
necessary and the results of complementation are observed directly,
with no further manipulation. The E. coli cell survival assay described
herein should therefore be particularly useful for screening cDNA
libraries for protein-protein interactions. mDHFR expression in cells
can be monitored by binding of fluorescent high-affinity substrate
analogues for DHFR.

There are several further aspects of the PCAs that distinguish them
from all other strategies for studying protein-protein interactions in
vivo (except USPS). We have designed complementary fragments of enzymes
that allow for controlling the stringency of the assay, and could be
used to obtain estimates of the kinetics and equilibrium constants for
association of two proteins. For example, with DHFR the point mutations
of the wild-type enzyme Ile114 to Val, Ala, or Gly alter the stringency
of reconstitution of DHFR activity. For determining estimates of
equilibrium and kinetic parameters for a specific protein-protein
interaction, one could perform a series of DHFR PCA experiments with
two proteins that interact with a known affinity, using the wild-type
or destabilizing mutant DHFR fragments. Comparison of cell growth rates
in this model system with rates for a DHFR PCA using unknowns would
give an estimate of the strength of the unknown interaction.
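
One way such a comparison might be carried out is sketched below. The
text does not specify a numerical procedure; linear interpolation of
log10(Kd) against doubling time is an assumption made here for
illustration, as are the function name `estimate_log10_kd` and the
placeholder calibration values, which are not measurements.

```python
import math
from typing import List, Tuple


def estimate_log10_kd(
    calibration: List[Tuple[float, float]],
    unknown_doubling_time: float,
) -> float:
    """Interpolate log10(Kd) for an unknown pair from reference pairs.

    calibration: (Kd in molar, doubling time in minutes) for reference
    pairs of known affinity, all measured with the same DHFR fragment
    variant. Assumes (hypothetically) a monotonic relation between the two.
    """
    points = sorted((t, math.log10(kd)) for kd, t in calibration)
    if len(points) < 2:
        raise ValueError("need at least two reference pairs")
    if not points[0][0] <= unknown_doubling_time <= points[-1][0]:
        raise ValueError("doubling time outside the calibrated range")
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if t0 <= unknown_doubling_time <= t1:
            if t1 == t0:
                return y0
            frac = (unknown_doubling_time - t0) / (t1 - t0)
            return y0 + frac * (y1 - y0)
    raise AssertionError("unreachable: range was checked above")


# Placeholder reference values, invented purely to exercise the function.
reference = [(1e-9, 60.0), (1e-6, 120.0)]
print(estimate_log10_kd(reference, 90.0))
```

Refusing to extrapolate beyond the calibrated range is a deliberate
choice: outside the reference pairs the sketch has no basis for an
estimate.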

It should be understood that the present invention should not be
limited to the DHFR or other PCAs presented, as these are only
non-limiting embodiments of the protein complementation assay of the
present invention. Moreover, the PCAs should not be limited in the
context in which they could be used. Constructs could be designed for
targeting the PCA fusions to specific compartments in the cell by
addition of signaling peptide sequences. Induced versus constitutive
protein-protein interactions could be distinguished by a eukaryotic
version of the PCA, in the case of an interaction that is triggered by
a biochemical event. Also, the system could be adapted for use in
screening for novel, induced protein-molecular associations between a
target protein and an expression library.

The instant invention is also directed to a method for detecting
biomolecular interactions, said method comprising:

(a) selecting an appropriate reporter molecule;

(b) effecting fragmentation of said reporter molecule such that said
fragmentation results in reversible loss of reporter function;

(c) fusing or attaching fragments of said reporter molecule separately
to other molecules; followed by

(d) reassociation of said reporter fragments through interactions of
the molecules that are fused to said fragments.

The invention also provides molecular fragment complementation assays
for the detection of molecular interactions comprising a reassembly of
separate fragments of a molecule, wherein reassembly of said fragments
is operated by the interaction of molecular domains fused to each
fragment of said molecules, and wherein reassembly of the fragments is
independent of other molecular processes.

In another aspect, the present invention is directed to a method of
testing biomolecular interactions comprising:

a) generating a first fusion product comprising
   i) a first fragment of a first molecule and
   ii) a second molecule which is different from or the same as said
   first molecule;

b) generating a second fusion product comprising
   i) a second fragment of said first molecule; and
   ii) a third molecule which is different from or the same as said
   first molecule or second molecule;

c) allowing the first and second fusion products to contact each
other; and

d) testing for activity regained by association of the recombined
fragments of the first molecule, wherein said reassociation is mediated
by interaction of the second and third molecules.

In another novel feature, the invention is directed to a method
comprising an assay where fragments of a first molecule are fused to a
second molecule and fragment association is detected by reconstitution
of the first molecule's activity.

The present invention also provides a composition comprising a product
selected from the group consisting of:

(a) a first fusion product comprising:
   1) a first fragment of a first molecule whose fragments can exhibit
   a detectable activity when associated and
   2) a second molecule that can bind (a)(1);

(b) a second fusion product comprising
   1) a second fragment of said first molecule and
   2) a third molecule that can bind (b)(1); and

(c) both (a) and (b).

The invention further provides a composition comprising complementary
fragments of a first molecule, each fused to a separate fragment of a
second molecule.

The inventors of the present subject matter further provide a
composition comprising a nucleic acid molecule coding for a fusion
product, which molecule comprises sequences coding for a product
selected from the group consisting of:

(a) a first fusion product comprising:
   1) fragments of a first molecule whose fragments can exhibit a
   detectable activity when associated and
   2) a second molecule fused to the fragment of the first molecule;

(b) a second fusion product comprising
   1) a second fragment of said first molecule and
   2) a second or third molecule; and

(c) both (a) and (b).

The present invention is also directed to a method of testing for
biomolecular interactions associated with: (a) complementary fragments
of a first molecule whose fragments can exhibit a detectable activity
when associated or (b) binding of two protein-protein interacting
domains from a second or third molecule, said method comprising:

1) creating a fusion of
   (a) a first fragment of a first molecule whose fragments can exhibit
   a detectable activity when associated and
   (b) a first protein-protein interacting domain;

2) creating a fusion of
   (a) a second fragment of said first molecule and
   (b) a second protein-protein interacting domain that can bind said
   first protein-protein interacting domain;

3) allowing the fusions of (1) and (2) to contact each other; and

4) testing for said activity.

The instant invention further provides a composition comprising a
product selected from the group consisting of:

(a) a first fusion product comprising:
   1) a first fragment of a molecule whose fragments can exhibit a
   detectable activity when associated and
   2) a first protein-protein interacting domain;

(b) a second fusion product comprising
   1) a second fragment of said first molecule and
   2) a second protein-protein interacting domain that can bind said
   first protein-protein interacting domain; and

(c) both (a) and (b).

The invention is also directed to a composition comprising a nucleic
acid molecule coding for a fusion product, which molecule comprises
sequences coding for either:

(a) a first fusion product comprising:
   1) a first fragment of a molecule whose fragments can exhibit a
   detectable activity when associated and
   2) a first protein-protein interacting domain; or

(b) a second fusion product comprising
   1) a second fragment of said molecule and
   2) a second protein-protein interacting domain that can bind said
   first protein-protein interacting domain; or

(c) both (a) and (b).

The invention also provides a method of detecting kinetics of protein
assembly and screening cDNA libraries comprising performing PCA.

In another embodiment, the invention further provides a method of
testing the ability of a compound to inhibit molecular interactions in
a PCA comprising performing a PCA in the presence of said compound and
correlating any inhibition with said presence.

In a further embodiment, the invention provides a method for detecting
protein-protein interactions in living organisms and/or cells, which
method comprises:

(a) synthesizing probe protein fragments from an enzyme which enables
dominant selection by dissecting the gene coding for the enzyme into at
least two fragments;

(b) constructing fusion proteins with one or more molecules that are to
be tested for interactions;

(c) fusing the proteins obtained in (b) with one or more of the probe
fragments;

(d) coexpressing the fusion proteins; and

(e) detecting the reconstitution of enzyme activity.

The invention still further provides a method for detecting
biomolecular interactions, said method comprising:

(a) selecting an appropriate reporter molecule;

(b) effecting fragmentation of said reporter molecule;

(c) fusing or attaching fragments of said reporter molecule separately
to other molecules; followed by

(d) reassociation of said reporter fragments through interactions of
the molecules that are fused to said fragments.

Lastly, the invention also provides a novel method of effecting gene
therapy, which includes the step of providing the assays and
compositions described above.

The present invention is pioneering as it is the first protein
complementation assay displaying such a level of simplicity and
versatility. The exemplified embodiments are protein-fragment
complementation assays (PCA) based on mDHFR, where a leucine zipper
directs the reconstitution of DHFR activity. Activity was detected by
an E. coli survival assay which is both practical and inexpensive. This
system illustrates the use of mDHFR fragment complementation in the
detection of leucine zipper dimerization and could be applied to the
detection of unknown, specific protein-molecular interactions in vivo.
It should be understood that the instant invention is not limited to
the PCAs presented here, as numerous other enzymes can be selected and
used in accordance with the teachings of the present invention.
Examples of such markers can be found in Kaufman (1987, Genetic Eng.
9:155-198) and references found therein, as well as in Table 1 of this
application.

It should also be clear to the skilled artisan to which the present
invention pertains that the invention is not limited to the use of
leucine zippers as the two interacting molecules. Indeed, numerous
other types of protein-molecule interactions can be used and identified
in accordance with the teaching of the present invention. The types of
motifs involved in protein-molecular interactions are well known in the
art.

The present application refers to numerous prior art documents, and the
entire contents of all those prior art documents are herein
incorporated by reference.

Other features and advantages of the present invention will be apparent
from the following description of the preferred embodiments thereof,
the appended Examples and the appended claims.



BRIEF DESCRIPTION OF THE DRAWINGS



FIG. 1 provides a general description of a PCA. Using molecular biology
techniques, the chosen fragments of the enzyme are subcloned, and to
the 5' ends of each, proteins that either are known or thought to
interact are fused. Co-transfection or transformation of these DNA
constructs into cells is then carried out and reconstitution is
observed with some assay.

FIG. 2 is a scheme of the fusion constructs used in one of the
embodiments of the invention. The hexahistidine peptide (6His), the
homodimerizing GCN4 leucine zipper (Zipper) and mDHFR fragments (1, 2
and 3) are illustrated. The labels for the constructs are used to
identify both the DNA constructs and the proteins expressed from these
constructs.

FIG. 3: (A) shows the E. coli survival assay on minimal medium plates.
Control: Left side of the plate: E. coli harboring pQE-30 (no insert);
right side: E. coli harboring pQE-16, coding for native mDHFR. Panel I:
Left side of each plate: transformation with construct Z-F[1,2]; right
side of each plate: transformation with construct Z-F[3]. Panel II:
Cotransformation with constructs Z-F[1,2] and Z-F[3]. Panel III:
Cotransformation with constructs Control-F[1,2] and Z-F[3]. All plates
contain 0.5 mg/ml trimethoprim. In panels I to III, plates on the right
side contain 1 mM IPTG.

(B) E. coli survival assay using destabilizing DHFR mutants. Panel I:
Cotransformation of E. coli with constructs Z-F[1,2] and
Z-F[3:Ile114Val]. Panel II: Cotransformation with Z-F[1,2] and
Z-F[3:Ile114Ala]. Inset is a 5-fold enlargement of the right-side
plate. Panel III: Cotransformation with Z-F[1,2] and Z-F[3:Ile114Gly].
All plates contain 0.5 mg/ml trimethoprim. Plates on the right side
contain 1 mM IPTG.

FIG. 4 features the coexpression of mDHFR fragments. (A) Agarose gel
analysis of the restriction pattern resulting from Hindi digestion of
plasmid DNA. Lane 1 contains DNA isolated from E. coli cotransformed
with constructs Z-F[1,2] and Z-F[3]. Lanes 2 and 3 contain DNA isolated
from E. coli transformed with, respectively, construct Z-F[3] and
construct Z-F[1,2]. Fragment migration (in bp) is indicated to the
right.

(B) SDS-PAGE analysis of mDHFR fragment expression. Lanes 1 to 5 show
crude lysate of untransformed E. coli (lane 1), or E. coli expressing
Z-F[1,2] (20.8 kDa; lane 2), Z-F[3] (18.4 kDa; lane 3), Control-F[1,2]
(14.2 kDa; lane 4), and Z-F[1,2] + Z-F[3] (lane 5). Lane 6 shows 40 µl
out of 2 ml copurified Z-F[1,2] and Z-F[3]. Arrowheads point to the
proteins of interest. Migration of molecular weight markers (in kDa) is
indicated to the right.

FIG. 5 illustrates the general features of a PCA based on a survival
assay such as the DHFR PCA. The assay can be used in a bacterial or a
mammalian context. The inserted target DNA can be a known sequence
coding for a protein (or protein domain) of interest, or can be a cDNA
library.

FIG. 6 represents an autoradiograph of a COS cell lysate after a 30
min. 35S-Met-Cys pulse-labelling. The expression pattern is essentially
identical to that observed in E. coli (see Fig. 4). The DNA transfected
(or cotransfected) into the cells is indicated above the respective
lanes.

FIG. 7 illustrates the results of a protein engineering
application of the mDHFR bacterial PCA. Two semi-random leucine
zipper libraries were created (as described in the text) and each inserted
N-terminal to one of the mDHFR fragments. Cotransformation of the
resulting zipper-DHFR fragment libraries in E. coli and plating on selective
medium allowed for survival of clones harboring successfully interacting
leucine zippers. Fourteen clones were isolated and the zippers were
sequenced to identify the residues at the e and g positions. The
e-g pairs were categorized as having attractive pairing
(charge:charge, charge:neutral-polar or neutral-polar:neutral-polar) or
repulsive pairing (charge:charge), and the number of each type of
interaction was scored for each clone. The total number of interactions for
each clone is 6; the interactions are tallied on the histogram.
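
The scoring of e-g pairs described for FIG. 7 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residue groupings (which amino acids count as charged or neutral polar), the function names and the example clone are all hypothetical and are not taken from the study.

```python
from collections import Counter

# Residue classes below are an illustrative assumption, not taken from the text.
POSITIVE = set("KR")
NEGATIVE = set("DE")
NEUTRAL_POLAR = set("NQST")


def charge(residue):
    """Return +1, -1 or 0 for a one-letter amino acid code."""
    if residue in POSITIVE:
        return 1
    if residue in NEGATIVE:
        return -1
    return 0


def is_polar(residue):
    """True for charged or neutral polar residues (as grouped above)."""
    return charge(residue) != 0 or residue in NEUTRAL_POLAR


def classify_pair(e_res, g_res):
    """Classify one e-g pair as 'attractive', 'repulsive' or 'other'."""
    ce, cg = charge(e_res), charge(g_res)
    if ce and cg:
        # Two charged residues: opposite signs attract, like signs repel.
        return "attractive" if ce != cg else "repulsive"
    if is_polar(e_res) and is_polar(g_res):
        # charge:neutral-polar or neutral-polar:neutral-polar
        return "attractive"
    return "other"


def tally_clone(pairs):
    """Count pairing types over the e-g pairs of one clone."""
    return Counter(classify_pair(e, g) for e, g in pairs)


# Hypothetical clone with six e-g pairs (made-up residues, not data from the study).
example = [("E", "K"), ("K", "E"), ("E", "E"), ("Q", "K"), ("Q", "Q"), ("R", "R")]
print(tally_clone(example))
```

One tally per clone, each summing to six pairs, gives the counts that a histogram such as the one in FIG. 7 would display.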

Other objects, advantages and features of the present
invention will become more apparent upon reading of the following
non-restrictive description of preferred embodiments with reference to the
accompanying drawings, which are exemplary and should not be
interpreted as limiting the scope of the present invention.



DESCRIPTION OF THE PREFERRED EMBODIMENTS

Selection of mDHFR for a PCA

In designing a protein-fragment complementation assay
(PCA), we sought to identify an enzyme for which the following is true: 1)
the enzyme is relatively small and monomeric, 2) structural
and functional information exists, 3) simple assays exist for both
in vivo and in vitro measurement, and 4) overexpression in
eukaryotic and prokaryotic cells has been demonstrated. Murine DHFR
(mDHFR) meets all of the criteria for a PCA listed above. Prokaryotic and
eukaryotic DHFR is central to cellular one-carbon metabolism and is
absolutely required for cell survival in both prokaryotes and eukaryotes.
Specifically, it catalyses the reduction of dihydrofolate to tetrahydrofolate
for use in transfer of one-carbon units required for biosynthesis of serine,
methionine, purines and thymidylate. The DHFRs are small (17 kD to 21
kD), monomeric proteins. The crystal structures of DHFR from various
bacterial and eukaryotic sources are known, and substrate binding sites
and active site residues have been determined, allowing for rational
design of protein fragments. The folding, catalysis, and kinetics of a
number of DHFRs have been studied extensively. The enzyme
activity can be monitored in vitro by a simple spectrophotometric assay,
or in vivo by cell survival in cells grown in the absence of DHFR end
products. DHFR is specifically inhibited by the anti-folate drug
trimethoprim. As mammalian DHFR has a 12000-fold lower affinity for
trimethoprim than does bacterial DHFR, growth of bacteria expressing
mDHFR in the presence of trimethoprim levels lethal to bacteria is an
efficient means of selecting for reassembly of mDHFR fragments into
active enzyme. High level expression of mDHFR has been demonstrated
in transformed prokaryote or transfected eukaryotic cells.



122-126 



mDHFR shares high sequence identity with the human
DHFR (hDHFR) sequence (91% identity) and is highly homologous to the
E. coli enzyme (29% identity, 68% homology), and these sequences share
visually superimposable tertiary structure. Comparison of the crystal
structures of mDHFR and hDHFR suggests that their active sites are
essentially identical. DHFR has been described as being formed of
three structural fragments forming two domains: the adenine binding
domain (residues 47 to 105 = fragment[2]) and a discontinuous domain
(residues 1 to 46 = fragment[1] and 106 to 186 = fragment[3]; numbering
according to the murine sequence). The folate binding pocket and the NADPH
binding groove are formed mainly by residues belonging to fragments [1]
and [2]. Fragment [3] is not directly implicated in catalysis.

Residues 101 to 108 of hDHFR, at the junction between
fragment[2] and fragment[3], form a disordered loop which lies on the
same face of the protein as both termini. We chose to cleave mDHFR
between fragments [1,2] and [3], at residue 107, so as to cause minimal
disruption of the active site and NADPH cofactor binding sites. The
native N-terminus of mDHFR and the novel N-terminus created by
cleavage occur on the same surface of the enzyme, allowing for ease
of N-terminal covalent attachment of each fragment to associating
fragments such as the leucine zippers used in this study. Using this
system, we have obtained leucine-zipper assisted assembly of the
mDHFR fragments into active enzyme.

EXAMPLE 1
EXPERIMENTAL PROTOCOL

DNA Constructs

Mutagenic and sequencing oligonucleotides were
purchased from Gibco BRL. Restriction endonucleases and DNA
modifying enzymes were from Pharmacia and New England Biolabs. The
mDHFR fragments carrying their own in-frame stop codon were
subcloned into pQE-32 (Qiagen), downstream from and in-frame with the
hexahistidine peptide and a GCN4 leucine zipper (Fig. 1; Fig. 2). All final
constructs were based on the Qiagen pQE series of vectors, which
contain an inducible promoter-operator element (tac), a consensus
ribosomal binding site, initiator codon and nucleotides coding for a
hexahistidine peptide. Full-length mDHFR is expressed from pQE-16
(Qiagen).

Expression vector harboring the GCN4 leucine zipper

Residues 235 to 281 of the GCN4 leucine zipper (a
SalI/BamHI 254 bp fragment) were obtained from a yeast expression
plasmid, pRS316. The recessed terminus at the BamHI site was filled in
with Klenow polymerase and the fragment was ligated to pQE-32
linearized with SalI/HindIII(filled-in). The product, construct Z, carries an
open reading frame coding for the sequence Met-Arg-Gly-Ser followed by
a hexahistidine tag and 13 residues preceding the GCN4 leucine zipper
residues.

Creation of DHFR fragments:

The eukaryotic transient expression vector, pMT3
(derived from pMT2), was used as a template for PCR-generation of
mDHFR containing the features allowing subcloning and separate
expression of fragment[1,2] and fragment[3]. The megaprimer method of
PCR mutagenesis was used to generate a full-length 590 bp product.
Oligonucleotides complementary to the nucleotide sequence coding for
the N- and C-termini of mDHFR and containing a novel BspEI site outside
the coding sequence were used, as well as an oligonucleotide used to
create a novel stop codon after fragment[1,2], followed by a novel SpeI
site for use in subcloning fragment[3].

Complementary oligonucleotides containing the novel
restriction sites SnaBI, NheI, SpeI and BspEI were hybridized together,
resulting in 5' and 3' overhangs complementary to EcoRI, and inserted
into pMT3 at a unique EcoRI site. The 590 bp PCR product (described
above) was digested with BspEI and inserted into pMT3 linearized at
BspEI, yielding construct [1,2,3]. The 610 bp BspEI/EcoNI fragment
(coding for DHFR fragment[1,2], followed by a novel stop and fragment[3]
up to EcoNI) was filled in at EcoNI and subcloned into pMT3 opened with
BspEI/HpaI, yielding construct F[1,2]. The 250 bp SpeI/BspEI fragment
of construct [1,2,3] coding for DHFR fragment[3] (with no in-frame stop
codon) was subcloned into pMT3 opened with the same enzymes. The
stop codon of the wild-type DHFR sequence, downstream from
fragment[3] in pMT3, was inserted as follows. Cleavage with EcoNI,
present in both the inserted fragment[3] and the wild-type fragment[3],
removal of the 683 bp intervening sequence and religation of the vector
yielded a construct of fragment[3] with the wild-type stop codon, construct
F[3].

Creation of expression constructs

The 1051 bp and the 958 bp SnaBI/XbaI fragments of
constructs F[1,2] and F[3], respectively, were subcloned into construct Z
opened with BglII(filled-in)/NheI, yielding constructs Z-F[1,2] and Z-F[3]
(Fig. 2). For the Control expression construct, the 180 bp XmaI/BspEI
fragment coding for the zipper was removed from construct Z-F[1,2],
yielding construct Control-F[1,2] (Fig. 2).

Creation of Stability Mutants

Site-directed mutagenesis was performed to produce
mutants at Ile114 (numbering of the wild-type mDHFR). The mutagenesis
reaction was carried out on the KpnI/BamHI fragment of construct Z-F[3]
subcloned into pBluescript SK+ (Stratagene), using oligonucleotides that
encode a silent mutation producing a novel BamHI site. The 206 bp
NheI/EcoNI fragment of putative mutants identified by restriction was
subcloned back into Z-F[3]. The mutations were confirmed by DNA
sequencing.

E. coli Survival Assay

E. coli strain BL21 cells carrying plasmid pRep4 (from
Qiagen, for constitutive expression of the lac repressor) were made
competent, transformed with the appropriate DNA constructs and washed
twice with minimal medium before plating on minimal medium plates
containing 50 µg/ml kanamycin, 100 µg/ml ampicillin and 0.5 µg/ml
trimethoprim. One half of each transformation mixture was plated in the
absence, and the second half in the presence, of 1 mM IPTG. All plates
were placed at 37°C for 66 hrs.

E. coli Growth Curves

Colonies obtained from cotransformation were
propagated and used to inoculate 10 ml of minimal medium
supplemented with ampicillin and kanamycin, as well as IPTG (1 mM) and
trimethoprim (1 µg/ml) where indicated. Cotransformants of Z-F[1,2] +
Z-F[3:Ile114Gly] were obtained under non-selective conditions by plating
the transformation mixture on L-agar (+ kanamycin and ampicillin) and
screening for the presence of the two constructs by restriction analysis.
All growth curves were performed in triplicate. Aliquots were withdrawn
periodically for measurement of optical density. Doubling time was
calculated for early logarithmic growth (OD600 between 0.02 and 0.2).
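
As an illustration of the doubling-time calculation described above, the following Python sketch fits a straight line to ln(OD600) against time, using only readings inside the early logarithmic window (0.02 to 0.2), and converts the slope to a doubling time. The text does not say how the fit was done; least-squares regression, the function name and the example readings are all assumptions made for illustration.

```python
import math


def doubling_time(times_min, od600, lo=0.02, hi=0.2):
    """Estimate doubling time (min) from OD600 readings in the window [lo, hi].

    Assumes exponential growth, so ln(OD) is linear in time with slope k
    and the doubling time is ln(2) / k. The least-squares fit is an assumption.
    """
    pts = [(t, math.log(od)) for t, od in zip(times_min, od600) if lo <= od <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two readings inside the OD window")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx  # growth rate constant k, per minute
    return math.log(2) / slope


# Made-up readings for illustration only; the last point lies outside the window.
times = [0, 60, 120, 180, 240]
ods = [0.02, 0.04, 0.08, 0.16, 0.5]
print(round(doubling_time(times, ods), 1))
```

Fold differences such as those reported in the results are then simple ratios of doubling times computed this way.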




Bacteria were propagated in Terrific Broth in the
presence of the appropriate antibiotics to an OD600 of approximately 1.0.
Expression was induced by addition of 1 mM IPTG and further incubation
for 3 hrs. For analysis of crude extract, pellets from 150 µl of induced
cells were lysed by boiling in loading dye. The lysates were clarified by
microcentrifugation and analyzed by SDS-PAGE (ref. 32). For protein
purification, a cell pellet from 50 ml of induced E. coli cotransformed with
constructs Z-F[1,2] and Z-F[3] was lysed by sonication, and a denaturing
purification of the insoluble pellet undertaken using Ni-NTA (Qiagen) as
described by the manufacturer. The proteins were eluted with a stepwise
imidazole gradient. The fractions were analyzed by SDS-PAGE.

RESULTS

fry qn^^pts for a PCA 

mDHFR shares high sequence identity with the human
DHFR (hDHFR) sequence. As the coordinates of the murine crystal
structure were not available, we based our design considerations on the
hDHFR structure. DHFR has been described as comprising three
structural fragments forming two domains: the adenine binding domain
(F[2]) and a discontinuous domain (F[1] and F[3]). The folate binding
pocket and the NADPH binding groove are formed mainly by residues
belonging to F[1] and F[2]. Residues 101 to 108 of hDHFR form a
disordered loop which lies on the same face of the protein as both termini.
This loop occurs at the junction between F[2] and F[3]. By cleaving
mDHFR at residue 107, we created F[1,2] and F[3], thus causing minimal
disruption of the active site and substrate binding sites. The native
N-terminus of mDHFR and the novel N-terminus created by cleavage were
covalently attached to the C-termini of GCN4 leucine zippers (Fig. 1).



E. coli Survival Assays

Figure 2 illustrates the general features of the expressed
constructs and the nomenclature used in this study. Figure 3 (panel A)
illustrates the results of cotransformation of bacteria with constructs
coding for Z-F[1,2] and Z-F[3], in the presence of trimethoprim, clearly
showing that colony growth under selective pressure is possible only in
cells expressing both fragments of mDHFR. There is no growth in the
presence of either Z-F[1,2] or Z-F[3] alone. Induction of protein
expression with IPTG is essential for colony growth (Fig. 3A). The
presence of the leucine zipper on both fragments of mDHFR is essential,
as illustrated by cotransformation of bacteria with both vectors coding for
mDHFR fragments, only one of which carries a leucine zipper (Fig. 3A).
It should be noted that growth of control E. coli transformed with the
full-length mDHFR is possible in the absence of IPTG due to low levels of
expression in uninduced cells.

Confirmation of the presence of both plasmids in
bacteria able to grow with trimethoprim was obtained from restriction
analysis of the plasmid DNA purified from isolated colonies. Figure 4 (A)
reveals the presence of the 1200 bp HincII restriction fragment from
construct Z-F[1,2] as well as the 487 and 599 bp HincII restriction
fragments from construct Z-F[3]. Also present is the 935 bp HincII
fragment of pRep4. Overexpression of the fusion proteins is illustrated
in Figure 4 (B). In all cases, overexpression of a protein of the expected
molecular weight is apparent on SDS-PAGE of the crude lysate.
Purification of the coexpressed proteins under denaturing conditions
yielded two bands of apparent homogeneity upon analysis by
Coomassie-stained SDS-PAGE (Fig. 4B).



Stability Mutants

Applicants generated mutants of F[3] to test whether
reconstitution of mDHFR activity by fragment assembly was specific.
Protein stability can be reduced by changing the side-chain volume in the
hydrophobic core of a protein. Residue Ile114 of mDHFR occurs in
a core β-strand at the interface between F[1,2] and F[3], isolated from the
active site. Ile114 is in van der Waals contact with Ile51 and Leu93 in
F[1,2]. We mutated Ile114 to Val, Ala, or Gly. Figure 3 (panel B)
illustrates the results of cotransformation of E. coli with construct Z-F[1,2]
and the mutated Z-F[3] constructs. The colonies obtained from
cotransformation with Z-F[3:Ile114Ala] grew more slowly than those
cotransformed with Z-F[3] or Z-F[3:Ile114Val] (see inset to Fig. 3B). No
colony growth was detected in cells cotransformed with Z-F[3:Ile114Gly].
The number of transformants obtained was not significantly different in
the cases where colonies were observed, implying that cells cotransformed
with Z-F[1,2] and either Z-F[3], Z-F[3:Ile114Val] or Z-F[3:Ile114Ala] have
an equal survival rate. Overexpression of the mutants Z-F[3:Ile114X] was
in the same range as Z-F[3], as determined by Coomassie-stained
SDS-PAGE (data not shown).

We also compared the relative efficiency of reassembly
of mDHFR fragments by measuring the doubling time of the
cotransformants in liquid medium. Doubling time in minimal medium was
constant for all transformants (data not shown). Selective pressure by
trimethoprim in the absence of IPTG prevented growth of E. coli except
when transformed with pQE-16 coding for full-length DHFR, due to low
levels of expression in uninduced cells. Induction of mDHFR fragment
expression with IPTG allowed survival of cotransformed cells (except in
the case of Z-F[1,2] + Z-F[3:Ile114Gly]), although the doubling times were
significantly increased relative to growth in the absence of trimethoprim.
The doubling times measured for cells expressing Z-F[1,2] + Z-F[3],
Z-F[1,2] + Z-F[3:Ile114Val] and Z-F[1,2] + Z-F[3:Ile114Ala] were 1.6-fold,
1.9-fold and 4.1-fold higher, respectively, than the doubling time of E. coli
expressing pQE-16 in the absence of trimethoprim and IPTG. The
presence of IPTG unexpectedly prevented growth of E. coli transformed
with full-length mDHFR. Growth was partially restored by addition of the
folate metabolism end-products thymine, adenine, pantothenate, glycine
and methionine (data not shown). This suggests that induced
overexpression of mDHFR was lethal to E. coli when grown in minimal
medium as a result of depletion of the folate pool by binding to the
enzyme.

In another embodiment, applicants make point
mutations in the GCN4 leucine zipper of Z-F[1,2] and Z-F[3], for which
direct equilibrium and kinetic parameters are known, and correlate these
known values with parameters derived from the PCA (Pelletier and
Michnick, in preparation). Comparison of cell growth rates in this model
system with rates for a DHFR PCA using unknowns would give an
estimate of the strength of the unknown interaction. This should enable
the determination of estimates of equilibrium and kinetic parameters for
a specific protein-protein interaction.

The present invention has illustrated and demonstrated
a protein-fragment complementation assay (PCA) based on mDHFR,
where a leucine zipper directs the reconstitution of DHFR activity. Activity
was detected by an E. coli survival assay which is both practical and
inexpensive. This system illustrates the use of mDHFR fragment
complementation in the detection of leucine zipper dimerization and could
be applied to the detection of unknown, specific protein-protein
interactions in vivo.

E. coli Aminoglycoside Kinase: Optimization and Design of a PCA
by a Molecular Evolution Strategy

Although applicants have demonstrated that the
engineering/design strategy described above can be used to produce
complementary enzyme fragments, it is obvious that proteins did not
evolve in such a way that such fragments would be expected to have
optimal physical characteristics, including solubility, foldability (fast
folding), protease resistance, or enzymatic activity. An alternative
embodiment to the engineering/design strategy is the
exonuclease/evolution approach. This strategy can be used by itself or
in conjunction with the engineering/design strategy. The advantages of
this approach are that, in principle, prior knowledge of the protein
structure is not necessary, that the optimal fragments are chosen for
PCA and that these fragments will also have optimal characteristics.
Following selection of optimal complementary fragments, the fragments
are exposed to multiple rounds of random mutagenesis. Mutagenesis is
achieved by suboptimal PCR combined with chemical mutagenesis or
DNA shuffling (Stemmer, W. P. C. (1994) Proc. Natl. Acad. Sci. USA 91,
10747-10751). The overall strategy is described for the case of
aminoglycoside kinase (AK), an example of an antibiotic resistance marker
that can be used for dominant selection of prokaryotic cells such as
E. coli or eukaryotic cells such as yeast or mammalian cell lines. The
structure of an AK is already known, and so strategy (1) would be
possible; however, we chose to combine strategy (1), as defined for
DHFR above, with strategy (2).

EXPERIMENTAL PROTOCOL
The optimization/selection procedure is as follows:

Creation of a library of AK fragments based on products of

Nested sets of deletions are created at the 5' and the 3'
ends of the AK gene. In order to create unidirectional deletions, unique
restriction sites are introduced in the regions flanking the AK gene. At the
5' and 3' termini, an "outer" sticky site with a protruding 3' terminus (Sph
I and Kpn I, respectively) and an "inner" sticky site with recessed 3'
terminus (Bgl II and Sal I, respectively) are added by PCR. Cleavage at
Sph I and Bgl II (or Kpn I and Sal I) results in creation of a protruding
terminus leading back to the flanking sequence and a recessed terminus
leading into the AK gene. Digestion with E. coli exonuclease III and S1
nuclease (Henikoff, S. (1987) Methods in Enzymology 155, 156-165)
yields a set of nested deletions from the recessed terminus only. Thus,
10 µg of DNA is digested with Sph I and Bgl II (or Kpn I and Sal I),
phenol-chloroform extracted, and 12.5 U exonuclease III added. At 30
sec intervals over 10 min, aliquots are taken and put into solution with 2
U S1 nuclease. The newly created ends are filled in with T4 DNA
polymerase (0.1 U per sample) and the set of vectors closed back by
blunt-ended ligation (10 U ligase per sample). The average length of the
deletion at each time point is determined by restriction analysis of the
sets. This yields sets of AK genes deleted from the 5' or the 3' termini.
This manipulation is undertaken directly in the pQE-32-Zipper constructs,
such that the products can be used directly in activity screening.
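
The time course described above (aliquots at 30 sec intervals over 10 min, with fixed enzyme amounts per sample) lends itself to a small bookkeeping sketch in Python. The sketch assumes the first aliquot is taken after one interval and the last at 10 min; the text does not state whether a zero-time sample is included, and the function and variable names are illustrative only.

```python
def aliquot_times(interval_s=30, total_s=600):
    """Sampling times in seconds.

    Assumption: the first aliquot is taken after one interval and the
    last at the end of the time course (no zero-time sample).
    """
    return list(range(interval_s, total_s + 1, interval_s))


# Per-sample enzyme amounts (units) as given in the protocol text.
PER_SAMPLE_UNITS = {
    "S1 nuclease": 2.0,
    "T4 DNA polymerase": 0.1,
    "ligase": 10.0,
}


def enzyme_totals(n_samples):
    """Total units of each enzyme needed for n_samples aliquots."""
    return {name: units * n_samples for name, units in PER_SAMPLE_UNITS.items()}


times = aliquot_times()
print(len(times), "aliquots, from", times[0], "s to", times[-1], "s")
print(enzyme_totals(len(times)))
```

Each time point corresponds to one set of nested deletions, whose average deletion length is then measured by restriction analysis as described.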



wo 98/34120 



PCT/CA98/00068 



Screening for AK activity

As a first step in determining the requirements for
fragment complementation, we must determine the minimum N-terminal
and C-terminal fragments of AK that, alone, are active. Sets of deletions
are individually transformed into E. coli BL21 cells and expression of the
AK fragments is induced by IPTG. The sets where a significant number
of colonies appear in the presence of G418 serve to indicate the
approximate length of N- and C-terminal AK fragments which retain
activity. Fragment complementation must therefore be undertaken with
fragments taken from within these limits. The zipper-directed fragment
complementation is detected as follows: appropriate sets of deletions, or
pools of sets, are cotransformed into BL21, expression is induced with
IPTG and growth in the presence of varying G418 concentrations is
monitored. Large colonies which grow in the presence of high G418
concentrations are selected as giving the most efficiently complementing
products.

Directed evolution of optimal AK fragments using "DNA shuffling"

After optimal fragments have been selected, the
individual fragments are removed by restriction digestion at Sph I and
Kpn I, allowing for 5' and 3' constant priming regions flanking the N- or
C-terminal complementary fragments of AK. These oligonucleotides (2-4
µg) are digested with DNaseI (0.005 units/µl, 100 µl) and fragments of
10-50 nucleotides are extracted from low melting point agarose. PCR is
then performed with the fragmented DNA, using Taq polymerase (2.5
units/µl) in a PCR mixture containing 0.2 mM dNTPs, 2.2 mM MgCl2 (or
0 mM for suboptimal PCR), 50 mM KCl, 10 mM Tris.HCl, pH 9.0, and 0.1%
Triton X-100. The PCR program is 94°C for 60 sec; 30 to 50 cycles of
94°C for 30 sec, 55°C for 30 sec and 72°C for 30 sec; then 72°C for 5 min.
Samples are taken every 5
cycles after 25 cycles to monitor the appearance of reassembled
complete fragments on agarose gel. The primerless PCR product is then
diluted 1:40 or 1:60 and used as template for PCR with 5' and 3'
complementary constant region oligos as primers for a further 20 cycles.
The final product is restriction digested with Sph I and Kpn I and the products
subcloned back into pQE32-Zipper to yield the final library of expression
plasmids. As before, E. coli BL21 cells are sequentially transformed with
C-terminal or N-terminal complementary fragment-expression vectors at
an estimated efficiency of 10^9, and finally the E. coli cotransformed with the
complementary fragments are grown on agarose plates containing
1 mg/ml G418; after 16 hours the largest colonies are selected and
grown in liquid medium at increasing concentrations of G418. Those
clones showing the maximal resistance to G418 are then selected, and if
maximum resistance or greater is reached the evolution is terminated.
Otherwise the DNA shuffling procedure is repeated. Finally, optimal
fragments are sequenced and physical properties and enzymatic activity
are assessed. This optimized AK PCA is now ready to test for dominant
selection in any other cell type, including yeast and mammalian cell lines.
This strategy can be used to develop any PCA based on enzymes that
impart dominant or recessive selection to a drug or toxin, or on enzymes
that produce a colored or fluorescent product. In the latter two cases the
end point of the evolution process is, at minimum, retainment of signal
for the intact, wild type enzyme, or enhancement of the signal. This
strategy can also be used in the absence of knowledge of the enzyme
structure, whether the enzyme is mono-, di- or multimeric.
However, knowledge of the enzyme structure does not preclude applying
this strategy as well, as described below.
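
The primerless reassembly program given above (an initial 94C step of 60 sec, 30 to 50 cycles of three 30 sec steps, and a final 72C step of 5 min, with samples every 5 cycles after 25 cycles) can be written down as a small Python sketch. The sketch assumes that the first sample is taken at cycle 25 itself and that only programmed hold times are counted (ramping between temperatures is ignored); the names are illustrative.

```python
INITIAL_DENATURATION_S = 60                               # 94C, 60 sec
CYCLE_STEPS_S = [("94C", 30), ("55C", 30), ("72C", 30)]   # repeated 30 to 50 times
FINAL_EXTENSION_S = 300                                   # 72C, 5 min


def hold_time_s(n_cycles):
    """Total programmed hold time in seconds (ramp times are not included)."""
    if not 30 <= n_cycles <= 50:
        raise ValueError("the protocol calls for 30 to 50 cycles")
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS_S)
    return INITIAL_DENATURATION_S + n_cycles * per_cycle + FINAL_EXTENSION_S


def sampling_cycles(n_cycles, first=25, every=5):
    """Cycles at which a sample is taken for agarose gel analysis.

    Assumption: sampling starts at cycle 25 and repeats every 5 cycles.
    """
    return list(range(first, n_cycles + 1, every))


for n in (30, 50):
    print(n, "cycles:", hold_time_s(n) / 60, "min; sample at cycles", sampling_cycles(n))
```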

As can be appreciated, knowledge of the enzyme
structure can be used to render a more efficient way of using molecular
evolution to design a PCA. In this case, the enzyme structure is used to
define minimal domains of the protein in question, as was done for DHFR.
Instead of generating fragments of completely random length for the N-
or C-terminal fragments, we select, during the exonuclease phase, those
fragments that at a minimum will code for one of the two domains. For
instance, in the case of AK, two well defined domains can be discerned
in the structure, consisting of residues 1-94 in the N-terminus and residues
95-267 in the C-terminus. Exonuclease digestions are performed as
above, but reaction products are selected that will minimally code for one
of the two domains. These are then the starting points for fragment
selection and evolution cycles as described above.
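
The structure-guided selection described above amounts to keeping only those deletion products that still span a whole domain. A minimal Python sketch follows, using the AK domain boundaries stated in the text (residues 1-94 and 95-267). Representing a fragment by its last residue (N-terminal fragments) or first residue (C-terminal fragments) is an assumption made for illustration, as are the function names and the example values.

```python
AK_LENGTH = 267
N_DOMAIN = (1, 94)            # residues 1-94, as stated in the text
C_DOMAIN = (95, AK_LENGTH)    # residues 95-267, as stated in the text


def covers(fragment, domain):
    """True if fragment (first, last residue) spans the whole domain."""
    return fragment[0] <= domain[0] and fragment[1] >= domain[1]


def keep_n_terminal(last_residues):
    """N-terminal fragments run from residue 1 to a deletion end point."""
    return [end for end in last_residues if covers((1, end), N_DOMAIN)]


def keep_c_terminal(first_residues):
    """C-terminal fragments run from a deletion start point to residue 267."""
    return [start for start in first_residues if covers((start, AK_LENGTH), C_DOMAIN)]


# Hypothetical deletion end points, for illustration only.
print(keep_n_terminal([60, 94, 120]))   # fragments ending at residues 94 and 120 are kept
print(keep_c_terminal([80, 95, 110]))   # fragments starting at residues 80 and 95 are kept
```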



Heteromeric Enzyme PCA

A further embodiment of the invention relates to PCA
based on using heterodimeric or heteromultimeric enzymes in which the
entire catalytic machinery is contained within one independently folding
subunit and the other subunit provides stability and/or a cofactor to the
enzymatic subunit. In this embodiment of PCA, the regulatory subunit is
split into complementary fragments and fused to interacting proteins.
These fragments are co-transformed/transfected into cells along with the
enzyme subunit. As with single enzyme PCA described for DHFR and
AK, reconstitution and detection of enzyme activity is dependent on
oligomerization domain-assisted reassembly of the regulatory subunit
into its native topology. However, the reconstituted subunit
then interacts with the intact enzymatic subunit to produce activity. This
approach is reminiscent of the USPS system, except it has the advantage
that the enzyme in this case is not a constitutive cellular enzyme, but
rather an exogenous gene product. As such, there is no problem with
background activity from the host cell, the enzyme can be expressed at
higher levels than a natural gene, and it can also be modified to be directed
to specific subcellular compartments (by subcloning compartment-specific
signal peptides onto the N- or C-termini of the enzyme and subunit
fragments). The specific advantage of this approach is that while the
single enzyme strategy may lead to suboptimal enzymatic activity, in this
approach the enzyme folds independently and may in fact act as a
chaperone to the fragmented regulatory subunit, aiding in its refolding.
In addition, folding of the fragments need not be complete in order
to impart regulation of the enzyme.

This approach is realized by a
colorimetric/fluorometric assay we have developed based on the
Streptomyces tyrosinase. This enzyme catalyzes the conversion of
tyrosine to dihydroxyphenylalanine (DOPA). The reaction can be measured
by conversion of fluorocinyl-tyrosine to the DOPA form. The active
enzyme consists of two subunits, the catalytic domain (Melc2) and a
copper binding domain (Melc1). Melc1 is a small protein of 14 kD that is
absolutely required for Melc2 activity. In the assay we are developing, the
Melc1 protein is split into two fragments that serve as the
complementation part of the PCA. These fragments, fused to
oligomerization domains, are coexpressed with Melc2, and the basis of
the assay is that Melc2 activity is dependent on complementation of the
Melc1 fragments.

Stoichiometries of protein complexes can also be
addressed (i.e. whether a complex consists of two or three proteins) as
follows. One fuses two proteins to the two Melc1 fragments and a third to
intact Melc2. It thus can be shown that the minimum complementary
active complex of the tyrosinase requires all three components,
and therefore that a trimer is necessary. A key aspect of this approach is that
we can easily demonstrate specific interactions by making one
component, specifically the protein-Melc2 catalytic subunit fusion,
dependent on the other components by underexpressing it in the
background of overexpressed Melc1 fragment-protein fusions.

Multimer Disruption-Based PCA

Although applicants have described only fragment complementation of
intact proteins, protein domains or subunits as comprising PCA, an
alternate embodiment relates to PCAs based on the disruption of the
interface between the subunits of, for instance, a dimeric enzyme that
requires stable association of the subunits for catalytic activity. In such
cases, selective or random mutagenesis at the subunit interface would
disrupt the interaction, and the basis of the assay would be that
oligomerization domains fused to the subunits would provide the necessary
binding energy to bring the subunits together into a functional enzyme.

Vector Design in Application to PCAs

The PCA strategies listed thus far have used two-plasmid
transformation strategies for expression of complementary
fragments. This approach has some advantages, such as using different
drug resistance markers to select for optimal incorporation of genes, for
instance in transformed or transfected cells, or for optimum transformation
of complementary plasmids into bacteria, and control of expression levels
of PCA fragments using different promoters. However, single plasmid
strategies have advantages in terms of simplicity of
transfection/transformation. Protein expression levels can be controlled in
different ways, while drug selection can be achieved in one of two ways. In the
case of PCAs based on a survival assay using enzymes that are drug
resistance markers themselves, such as AK, or where the enzyme
complements a metabolic pathway, such as DHFR, no additional drug
resistance genes need be incorporated in the expression plasmids. If,
however, the PCA is based on an enzyme that produces a colored or
fluorescent product, such as tyrosinase or firefly luciferase, an additional
drug resistance gene must be expressed from the plasmid. PCA
complementary fragments and fused cDNA libraries/target genes
can be assembled on single plasmids as individual operons under the
control of separate inducible or constitutive promoters, or can be
expressed polycistronically. In E. coli, polycistronic expression can be
achieved using known intercoding region sequences; for instance, we use
the region in the mel operon from which we derived the tyrosinase
melc1-melc2 genes, which we have shown to be expressed at high levels in
E. coli under the control of a strong (tac) promoter. Genes could also be
expressed and induced off of independent promoters, such as tac and
arabinose. For mammalian expression systems, single plasmid systems
can be used for both transient and stable cell line expression and for
constitutive or inducible expression. Further, the expression of one of
the complementary fragment fusions, usually the
bait-fused fragment, can be differentially controlled to minimize expression.
This will be important in reducing background non-specific interactions.
Examples of differential control of complementary fragment expression
include the following strategies:

i) In polycistronic expression, transient or stable, expression of the
second gene will necessarily be less efficient, and so this in itself could
serve to limit the quantity of one of the complementary fragments.
Alternatively, the first gene product can be limited in expression by
mutation of an upstream donor/splice site, while the second gene can be
put under the control of a viral internal initiation site, such as that of
EMCV, to enhance expression.

ii) Individual complementary fragment-fusion pairs can also be put under
the control of inducible promoters, all commercially available, including
those based on the Tet-responsive PhCMV*-1 promoter and/or steroid
receptor response elements. In such a system the two complementary
fragment genes can be turned on and their expression levels controlled in
a dose-dependent manner with the inducer, in these cases tetracycline
and steroid hormones.



EXAMPLE 2 

Applications of the PCA strategy to detect novel gene products in
biochemical pathways and to map such pathways

Among the greatest advantages of PCAs over other
molecular interaction screening methods is that they are designed to be
performed both in vivo and in any type of cell. This feature is crucial if the
goal of applying a technique is to identify novel interactions from libraries
and simultaneously to determine whether the interactions observed are
biologically relevant. The detailed example given below, and other
examples at the end of this section, illustrate how validation of
interactions with PCA is possible. In essence, this is achieved as follows.
In biochemical pathways, such as hormone receptor-mediated signaling,
a cascade of enzyme-mediated chemical reactions is triggered by some
molecular event, such as hormone binding to its membrane surface
receptor. Enzyme interactions with protein substrates and protein-protein
or protein-nucleic acid interactions with enzyme-modified substrates then
occur. Such biochemical signaling cascades occur only in specific cell
types and in model cell lines for studying these processes. Therefore, to
detect induced interactions, such as those of known proteins in a pathway
with as yet unidentified proteins, one needs to perform such
screening in appropriate model cell lines and in the correct cellular
compartment. Only the PCA strategy can be used in a general way to do
this. Protein-molecular interaction techniques such as yeast two- or
three-hybrid techniques cannot be performed in a context where such
events occur, except in the limiting case of nuclear interactions in yeast or
interactions that are not triggered. There do exist mammalian two-hybrid
techniques with which it might be possible to detect induced protein
interactions, but again only if the proteins involved can be simultaneously
activated, transported to the nucleus and interact with their partners.
PCAs do not have these limitations, since they do not require additional
cellular machinery available only in specific compartments. A further
point is that by performing the PCA strategy in appropriate model cell
types, it is also possible to introduce appropriate positive and negative
controls for studying a particular pathway. For instance, for a hormone
signaling pathway it is likely that hormone signaling agonists and
antagonists, or dominant-negative mutants of signaling cascade proteins
that are upstream of or act in parallel to the events being
examined in the PCA, would be known. These reagents could be used to
determine whether novel interactions detected by the PCA are biologically
relevant. In general, then, interactions that are detected only if hormone
is introduced but are not seen if an antagonist is simultaneously
introduced could be hypothesized to represent interactions relevant to
the process under study.
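
The validation logic described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the
PcaReadout record, the classify_interaction function, the category names
and the numeric threshold are all hypothetical and are not part of the
assay as described; the readout could be any PCA signal, such as a
colony count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcaReadout:
    """Hypothetical PCA signal for one bait/prey pair under three conditions."""
    untreated: float
    hormone: float
    hormone_plus_antagonist: float


def classify_interaction(readout: PcaReadout, threshold: float = 1.0) -> str:
    """Classify an interaction from its PCA signal (threshold is an assumption)."""
    basal = readout.untreated >= threshold
    induced = readout.hormone >= threshold
    blocked = readout.hormone_plus_antagonist < threshold
    if basal and induced:
        # Seen with and without hormone: not triggered by the pathway.
        return "constitutive"
    if induced:
        # Seen only with hormone; pathway-relevant if the antagonist abolishes it.
        if blocked:
            return "induced, antagonist-sensitive"
        return "induced, antagonist-insensitive"
    if basal:
        return "lost on stimulation"
    return "not detected"
```

Under these assumptions, only pairs classified as "induced,
antagonist-sensitive" would be put forward as candidate interactions
relevant to the process under study.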

Below is a detailed description of an application of the
DHFR PCA that illustrates these points, as well as further examples where
the PCA strategy could be used.

Application of the DHFR PCA to Mapping Growth Factor-Mediated
Signal Transduction Pathways

One of the earliest detectable events in growth factor-
activated cell proliferation is the serine phosphorylation of the S6 protein
of the 40S ribosomal subunit. The discovery of serine/threonine kinases
that specifically phosphorylate S6 has considerably aided in identifying
novel mitogen-mediated signal transduction pathways. The
serine/threonine kinase p70S6k has been identified as a specific S6
phosphorylase. p70S6k is activated by serine and threonine
phosphorylation at specific sites in response to several mitogenic signals,
including serum in serum-starved cells, growth factors including insulin
and PDGF, and mitogens such as phorbol esters. Considerable effort
has been made over the last five years to determine how p70/p85S6k are
activated in response to mitogens. Two receptor-mediated pathways have
been implicated in p70S6k activation, one associated with the
phosphatidylinositol-3-kinase (PI(3)K) and the other with the PI(3)K
homologue mTOR. Key to understanding this proposal is the fact
that the role of these enzymes in activation of p70S6k was determined from
the effects of two natural products on phosphorylation and enzyme activity:
rapamycin, which indirectly inhibits mTOR activity, and wortmannin, which
directly inhibits PI(3)K activity. It is also important to note that no direct
upstream kinases or other regulatory proteins of p70S6k have been
identified to date.

The interactions of p70S6k with its known substrate S6
can be studied as a test system for the DHFR PCA in E. coli and in
mammalian cell lines. One can also seek to identify novel interactions
with this enzyme that would lead to new insights into how this important
enzyme is regulated. Also, since activation of the enzyme is mediated by
multiple pathways that can be selectively inhibited with specific drugs, this
is an ideal system to test PCAs as methods to distinguish induced from
constitutive protein-protein interactions.

a) Test of the E. coli survival assay: Interaction of p70S6k with S6
This test is ideal, because the apparent Km (= 250 nM)
of p70S6k for S6 protein is approximately the same as the Kd for the
leucine zipper-forming peptides from GCN4 used in our test system.
However, we will have to use a constitutively active form of the enzyme
for our tests. An N-terminal truncated form of the enzyme, D77-p70S6k,
is constitutively active and will be used in these studies (147).
Methodology: D77-p70S6k-F[1,2] fusion and D77-p70S6k-F[3] fusion, or
F[1,2] and D77-p70S6k-F[3] fusion (as a control), will be cotransformed
into E. coli and the cells grown in minimal medium in the presence of
trimethoprim. Colonies will be selected and expanded for analysis of
kinase activity against 40S ribosomal subunits, and for coexpression of
the two proteins.
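
As a rough illustration of why an affinity near 250 nM is a useful test
case, the sketch below evaluates the standard 1:1 binding isotherm. It
rests on assumptions that are not in the text: the binding partner is
taken to be in excess, so that its free concentration is fixed, and the
apparent Km is treated as if it were a dissociation constant. The
function name fraction_bound is hypothetical.

```python
def fraction_bound(free_partner_nM: float, kd_nM: float = 250.0) -> float:
    """Fraction of a fragment fusion in complex for simple 1:1 binding.

    Uses f = [L] / (Kd + [L]), which assumes the partner L is in excess.
    The 250 nM default mirrors the apparent Km quoted in the text.
    """
    if free_partner_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return free_partner_nM / (kd_nM + free_partner_nM)
```

With these assumptions, half of the fusion is in complex when the free
partner concentration equals the dissociation constant, and the
fraction rises toward saturation only at concentrations well above it.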

b) Modification of the bacterial survival assay for library screening:
Identification of Novel Interacting Proteins

Screening an expression library for interactions with a
given target (p70S6k-D77, in this case) will be straightforward in this
system, given that the only steps involved are: 1-construction of the
fusion-expression library as a fusion with mDHFR fragment [3]; 2-
transformation of the library into E. coli BL21 harboring pRep4 (for
constitutive expression of the lac repressor; this is required in the case
where a protein product is toxic to the cells) and a plasmid coding for the
fusion p70S6k-D77-F[1,2]; 3-plating on minimal medium in the presence
of trimethoprim and IPTG; 4-selection of any colonies that grow,
propagation and isolation of plasmid DNA, followed by sequencing of
DNA inserts; 5-purification of unknown fusion products via the hexaHis-
tag and sizing on SDS-PAGE.

Methodology:

The overall strategy is illustrated in Figure 5. 1-
Construction of a directional fusion-expression library: i-cDNA production:
One can isolate poly(A)+ RNA from BA/F3 cells (B-lymphoid cells),
because these cells have successfully been used in the study of the
rapamycin-sensitive p70S6k activation cascade. To enrich for full-
length mRNA, we will affinity purify the mRNA via the 5' cap structure by
the CAPture method. Reverse transcription will be primed by a "Linker
Primer": it has a poly(T) tail to prime from the poly(A) mRNA tail, and an
XhoI site for later use in directional subcloning of the fragments. The first
strand is then methylated. After second strand synthesis and blunting of
the products, "EcoRI Adapters" are added; digestion of the
linkers with EcoRI and XhoI (the inserts are protected by methylation)
produces full-length cDNA ready for directional insertion in a vector
opened with EcoRI and XhoI. Because the success of library screening
depends largely on the quality of the cDNA produced, we will use the
above methods, as they have proven to consistently produce high-quality
cDNA libraries. ii-Insertion of the cDNA into vectors: The library will be
constructed as a C-terminal fusion to mDHFR F[3] in vector pQE-32
(Qiagen), as we have obtained high levels of expression of mDHFR
fusions from this vector in BL21 cells. Three such vectors will be created,
differing at their 3' end, which is the novel polycloning site that we
engineered (described earlier, under Methods), carrying either 0, 1, or 2
additional nucleotides. This allows read-through from F[3] into the library
fragments in all 3 translational reading frames. The cDNA fragments will
be directionally inserted at the EcoRI and XhoI sites in all three vectors
at once. 2, 3, 4 and 5- These steps have been described earlier, under
Results, apart from the final sequencing of the clones identified, using
sequencing primers specific to vector sequences flanking the sites of library
insertion. The protein purification will also be as described earlier, by a
one-step purification on Ni-NTA (Qiagen). If the product size is more than
15 kDa over the molecular weight of the DHFR component (equal to a
cDNA insert of more than 450 bp), we will have the inserts sequenced at
the Sheldon Biotechnology Center (McGill University).
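
Two points of arithmetic in the methodology above can be checked with a
short sketch: that 0, 1 or 2 additional nucleotides after F[3] together
cover all three reading frames of an insert, and how an insert length
relates to added protein mass. The function names, the toy sequence and
the average residue masses are assumptions made for illustration; the
text's equivalence of 450 bp with 15 kDa corresponds to about 100 Da per
residue, while 110 Da is another commonly used average.

```python
def insert_frame_offset(extra_nt: int) -> int:
    """Position in the insert of the first full codon read through from F[3]."""
    if extra_nt not in (0, 1, 2):
        raise ValueError("the three vectors carry 0, 1 or 2 extra nucleotides")
    return (-extra_nt) % 3


def codons_read_from_insert(insert: str, extra_nt: int) -> list[str]:
    """Codons that lie wholly within the insert for a given vector."""
    start = insert_frame_offset(extra_nt)
    return [insert[i:i + 3] for i in range(start, len(insert) - 2, 3)]


def insert_bp_to_kda(insert_bp: int, da_per_residue: float = 110.0) -> float:
    """Approximate mass added to the fusion by an insert of the given length."""
    return (insert_bp // 3) * da_per_residue / 1000.0
```

For example, the toy insert "ATGGCCAAA" is read as ATG GCC AAA from the
vector with no extra nucleotides, and in the two shifted frames from the
other two vectors, so each cDNA is in frame with F[3] in exactly one of
the three.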

c) Development of the Eukaryotic Assay

The system described above can be adapted to produce
an equivalent assay for use in eukaryotic cells. The
basic principle of the assay is the same: the fragments of mDHFR are
fused to associating domains, and domain association is detected by
reconstitution of DHFR activity in eukaryotic cells (Figure 5).
Creation of the expression constructs: The DNA fragments coding for
the GCN4-zipper-mDHFR fragment fusions were inserted as one piece
into pMT3, a eukaryotic transient expression vector. Expression of the
fusion proteins in COS cells was apparent on SDS-PAGE after [35S]Met
labeling.

Survival assays in eukaryotic cells: Two systems can be used for
detection of mDHFR reassembly, in parallel: i- CHO-DUKX B11 cells
(Chinese Hamster Ovary cell line deficient in DHFR activity) are
cotransfected with GCN4-zipper-mDHFR fragment fusions. The cells are
grown in the absence of nucleotides; only cells carrying reconstituted
DHFR will undergo normal cell division and colony formation. ii-
Methotrexate (MTX)-resistant mutants of mDHFR have been created, with
the goal of transfecting cells that have constitutive DHFR activity, such as
COS and 293 cells. We mutated F[1,2] in order to incorporate, one at a
time, each of five mutations that significantly increase Ki (MTX):
Gly15Trp, Leu22Phe, Leu22Arg, Phe31Ser and Phe34Ser (numbering
according to the wild-type mDHFR sequence). These mutations occur at
varying positions relative to the active site and relative to F[3], and have
varying effects on Km (DHF), Km (NADPH) and Vmax of the full-length
mammalian enzymes in which they were studied. Mutants Z-F[1,2: Leu22Phe],
Z-F[1,2: Leu22Arg] and Z-F[1,2: Phe31Ser] all allowed for bacterial
survival with high growth rates when cotransformed with Z-F[3] (results
not shown). The five mutants will be tested in eukaryotic cells, in
reconstitution of mDHFR fragments to produce enzyme that can sustain
COS or 293 cell growth while under the selective pressure of MTX, which
will eliminate background due to activity of the native enzyme. The
mutations offer an advantage in selection while presenting no apparent
disadvantage with respect to reassembly of active enzyme. If the
reconstituted mDHFR produced in either of the survival assays allows
eukaryotic cell growth that is significantly slower than growth with the wild-
type enzyme, thymidylate will be added to the growth medium to partially
relieve the selective pressure offered by the lack of nucleotides.
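
The two selection schemes above can be summarized as a simple decision
rule. The sketch below is a deliberately simplified model, not the
assay itself: the function name, its boolean arguments and the
all-or-nothing treatment of MTX inhibition and nucleotide rescue are
assumptions introduced for illustration.

```python
def survives(*, endogenous_dhfr: bool, reconstituted_dhfr: bool,
             mtx_resistant_fragment: bool, nucleotides_in_medium: bool,
             mtx_in_medium: bool) -> bool:
    """Simplified survival rule for the two eukaryotic DHFR selections."""
    if nucleotides_in_medium:
        # Supplied nucleotides bypass the need for DHFR activity.
        return True
    # Native enzyme is assumed to be fully inhibited whenever MTX is present.
    native_active = endogenous_dhfr and not mtx_in_medium
    # Reassembled fragments stay active under MTX only if F[1,2] is mutant.
    pca_active = reconstituted_dhfr and (mtx_resistant_fragment
                                         or not mtx_in_medium)
    return native_active or pca_active
```

In this model a DHFR-deficient CHO cell in nucleotide-free medium
survives only when the fragments reassemble, and a COS or 293 cell
under MTX survives only when the reassembled enzyme carries a
resistance mutation, which is the background-free selection the text
describes.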

Test of the eukaryotic survival assay

It is necessary at the outset to test whether induced
interactions with p70S6k can be detected. One can use the same test
system as that used for the E. coli assay described above: induction of
association of p70S6k with S6 protein.
Methodology:

mDHFR Leu22Phe mutant S6-F[1,2] and p70S6k-F[3], or F[1,2] and
p70S6k-F[3] (as a control), will be cotransfected into COS cells, and the
cells will be serum starved for 48 hours, followed by replating of cells at
low density in serum and MTX. Colonies will be selected and expanded
for analysis of kinase activity against 40S ribosomal subunits, and for
coexpression of the two proteins. Further controls will be performed for
inhibition of protein association with wortmannin and rapamycin.




An important part of the work required in creating a 
library for use in eukaryotic cells will have been accomplished already, as
the EcoRI/XhoI directional cDNA produced by the Stratagene "cDNA
Synthesis Kit" can directly be inserted directionally into the Stratagene
Zap Express vector.

Methodology:

Steps 1 through 5 are parallel to those for the bacterial
library screening (above). 1-Again, the library is constructed as a C-
terminal fusion to mDHFR F[3]. F[3] (with no stop codon) will be inserted
in frame in Zap Express, followed by insertion of the novel polylinkers
allowing expression of the inserts in all three reading frames (described
above), and by the EcoRI/XhoI directional cDNA. This bacteriophage
library will be propagated and treated with the Stratagene helper phage
to excise a eukaryotic expression phagemid vector (pBK-CMV) carrying
the fusion inserts. 2-Cotransfection of the library and p70S6k-F[1,2]
constructs in eukaryotic cells: we will perform the screening in COS or
293 cells, as these are responsive to serum in activating the p70S6k
signaling pathway. Selection experiments will be performed as described
for the S6 test system above. 3-Propagation, isolation and sequencing
of the insert DNA will be undertaken. 4-The cloned fusion proteins will be
sized on SDS-PAGE by direct visualization after 35S-Met/Cys labeling,
or by Western blotting using a commercial polyclonal antibody to mDHFR.

Generalization of the Strategy: The scheme for detecting partners for
the protein p70S6k can be applied to studies of any biochemical pathway
in any living organism. Such pathways may also be related to disease
processes. The disease-related pathway may be an intrinsic process of
cells in humans where a pathology arises from, for instance, mutation,
deletion, or under- or over-expression of a gene. Alternatively, the
biochemical pathway may be one that is specific to a pathogenic
organism or the mechanism of host invasion. In this case, component
proteins of such processes may be targets of a therapeutic strategy, such
as development of drugs that inhibit invasion by the organism or a
component enzyme in a biochemical pathway specific to the pathogenic
organism.

Inflammatory diseases are a case in point that can
concern both examples. The protein-protein interactions that mediate the
adhesion of leukocytes to inflamed tissues are known to involve such
proteins as vascular cell adhesion molecule-1 (VCAM-1), and certain
cytokines such as IL-6 and IL-8 that are produced during inflammation.
However, many of the proteins involved in the onset of the inflammatory
response remain unknown; further, the intracellular signaling pathways
triggered by the extracellular associations are poorly understood. The
PCAs could be used in elucidation of the mechanisms underlying the
onset of inflammation, as well as the ensuing signaling. For example,
signaling pathways associated with inflammation, such as those mediated
by IL-1, IL-6, IL-8 and tumor necrosis factor, have been studied in some
detail and many direct and downstream regulators are known. These
regulators can be used as starting point targets in a PCA screening to
identify other signaling or modulating proteins that could also be targets
for drug development.

There is an increased risk of infection by enteric
pathogens during the intestinal inflammation that
characterizes idiopathic intestinal diseases. There are two mechanisms
which need to be better understood here and which can be addressed by
PCA:

i- the cellular mechanisms of inflammation as described above, and
ii- the discovery of the specific cell-surface ligands which the pathogenic
organisms recognize and associate with. Secreted proteins produced by
the pathogen can bind to the basolateral membrane of epithelial cells (as
is the case in Yersinia pseudotuberculosis infection) or be translocated
into intestinal epithelial cells (Salmonella infection), promoting infectivity
and/or physiological responses to the infection. However, in most cases
the interactions between the pathogenic protein and the epithelial cells
are unknown.

Cell adhesion and nervous system regeneration: A related example in
cell adhesion includes processes involved in development and
regeneration in the nervous system. Cadherins are membrane proteins
that mediate calcium-dependent cell-cell adhesion. To do so they need
another class of cytoplasmic proteins called catenins. These form a
bridge between cadherins and the cytoskeleton. Catenins also regulate
genes that control differentiation-specific genes. For instance, the protein
β-catenin can interact in certain situations with a transcription factor
(LEF-1) and be translocated into the nucleus, where it constrains the
number of genes transactivated by LEF-1 (differentiation). This process is
regulated by the Wnt signaling pathway (homologous to the wingless
pathway in Drosophila) through inactivation of GSK3β, which permits
degradation of APC (a cytoplasmic adapter protein). PCA strategies could
be used to identify novel proteins involved in the regulation of these
processes.

Proteins involved in viral integration processes are 
examples of targets that could be tested for inhibitors using the PCA 
strategies. Examples for the HIV virus include: 






i) Inhibition of integrase or of the transport of the pre-integration
complex: protein MA or Vpr.

ii) Inhibition of the cell cycle in G2 by Vpr (interaction with cyclin B),
causing induction of apoptosis.

iii) Inhibition of the interaction of gp160 (precursor of the membrane
proteins) with furin.

Accessory proteins of HIV as a therapeutic target:

i) Vpr: nuclear localizing sequence (target); interaction site of Vpr with
phosphatase 2A.

ii) Vif: interaction with vimentin (cytoskeleton-associated protein).

iii) Vpu: degradation of CD4 in the ER mediated by the cytoplasmic tail of
Vpu.

iv) Nef: myristoylation signal of Nef.






EXAMPLE 3

Other Examples of Protein Fragment Complementation Assays

Other examples of assays are exemplified herein. The
reason to produce these assays is to provide alternative PCA strategies
that would be appropriate for specific protein association problems, such
as studying equilibrium or kinetic aspects of assembly. Also, it is possible
that in certain contexts (for example, specific cell types) or for certain
applications, a specific PCA will not work but an alternative one will.
Below are brief descriptions of each of these other PCA embodiments.

1) Glutathione-S-Transferase (GST) GST from the flatworm
Schistosoma japonicum is a small (28 kD), monomeric, soluble protein
that can be expressed in both prokaryotic and eukaryotic cells. A high
resolution crystal structure has been solved and serves as a starting point
for design of a PCA. A simple and inexpensive colorimetric assay for
GST activity has been developed, consisting of the reductive conjugation
of reduced glutathione with 1-chloro-2,4-dinitrobenzene (CDNB), giving a
brilliant yellow product. We have designed a PCA based on structural
criteria similar to those used to develop the DHFR PCA, using GCN4
leucine zippers as oligomerization domains. Cotransformants of
zipper-GST-fragment fusions are expressed in E. coli on agar plates and
colonies are transferred to nitrocellulose paper. Fragment
complementation is detected in an assay where a glutathione-CDNB
reaction mixture is applied as an aerosol on the nitrocellulose, and
colonies co-expressing fragments of GST are detected as
yellow images.

2) Green Fluorescent Protein (GFP) GFP from Aequorea victoria is
becoming one of the most popular protein markers for gene expression.
This is because the small, monomeric 238-amino-acid protein is
intrinsically fluorescent due to the presence of an internal chromophore
that results from the autocatalytic cyclization of the polypeptide backbone
between residues Ser65 and Gly67 and oxidation of the α-β bond of Tyr66.
The GFP chromophore absorbs light optimally at 395 nm and also
possesses a second absorption maximum at 470 nm. This bi-specific
absorption suggests the existence of two low energy conformers of the
chromophore whose relative population depends on the local environment
of the chromophore. A mutant, Ser65Thr, that eliminates isomerization
(single absorption maximum at 488 nm) results in fluorescence 4 to 6
times more intense than that of the wild type. Recently the structure of
GFP has been solved by two groups, making it now a candidate for a
structure-based PCA design, which we have begun to develop. As with
the GST assay, we are doing all of our initial development in E. coli with
GCN4 leucine zipper-forming sequences as oligomerization domains.
Direct detection of fluorescence by visual observation under broad
spectrum UV light will be used. We will also test this system in COS cells,
selecting for co-transfectants using fluorescence activated cell sorting
(FACS).

3) Firefly Luciferase Firefly luciferase is a 62 kDa protein which
catalyzes oxidation of the heterocycle luciferin. The product possesses
one of the highest quantum yields for bioluminescent reactions: one
photon is emitted for every oxidized luciferin molecule. The structure of
luciferase has recently been solved, allowing for structure-based
development of a PCA. As with our GST assay, cells are grown on a
nitrocellulose matrix. The addition of luciferin at the surface of the
nitrocellulose permits it to diffuse across the cytoplasmic membranes and
trigger the photoluminescent reaction. The detection is done immediately
on a photographic film. Luciferase is an ideal candidate for a PCA: the
detection assays are rapid, inexpensive and very sensitive, and utilize a
non-radioactive substrate that is available commercially. The substrate of
luciferase, luciferin, can diffuse across the cytoplasmic membrane (under
acidic pH), allowing the detection of luciferase in intact cells. This
enzyme is currently utilized as a reporter gene in a variety of expression
systems. The expression of this protein has been well characterized in
bacterial, mammalian, and plant cells, suggesting that it would provide
a versatile PCA.

4) Xanthine-guanine phosphoribosyl transferase (XGPRT) The E. coli
enzyme XGPRT converts xanthine to xanthine monophosphate (XMP),
a precursor of GMP. Because the mammalian enzyme hypoxanthine-
guanine phosphoribosyl transferase (HGPRT) can only use hypoxanthine
and guanine as substrates, the bacterial XGPRT can be used as a
dominant selection assay for a PCA for cells grown in the presence of
xanthine. Vectors expressing XGPRT confer on mammalian cells the
ability to grow in selective medium containing adenine, xanthine, and
mycophenolic acid. The function of mycophenolic acid is to inhibit de
novo synthesis of GMP by blocking the conversion of IMP into XMP
(Chapman, A. B. (1983) Molec. & Cellul. Biol. 3, 1421-1429). The only
GMP produced then comes from the conversion of xanthine into XMP,
catalyzed by the bacterial XGPRT. As with aminoglycoside
phosphotransferase, fragments of XGPRT can be generated based on the
known structure (see Table 1) using the design-evolution strategy
described above, with fragments fused to the GCN4 leucine zippers as
test oligomerization domains. The complementary fusions are
cotransfected and the proteins transiently expressed in COS-7 cells, or
stably expressed in CHO cells, grown in the selective medium. In the
case of CHO cells, colonies are collected and sequentially re-cultured at
increasing concentrations of the selective compounds in order to enrich
for populations of cells that efficiently express the fusions at high
concentrations.

5) Adenosine deaminase Adenosine deaminase (ADA) is present in
minute quantities in virtually all mammalian cells. Although it is not an
essential enzyme for cell growth, ADA can be used in a dominant
selection assay. It is possible to establish growth conditions in which the
cells require ADA to survive. ADA catalyzes the irreversible conversion
of cytotoxic adenine nucleosides to their respective nontoxic inosine
analogues. When cytotoxic concentrations of adenosine or of cytotoxic
adenosine analogues such as 9-β-D-xylofuranosyladenine are added to
the cells, ADA is required for cell growth to detoxify the cytotoxic agent.
Cells that incorporate the ADA gene can then be selected for amplification
in the presence of low concentrations of 2'-deoxycoformycin, a
tight-binding transition state analogue inhibitor of ADA. ADA can then be
used for a PCA based on cell survival (Kaufman, R. J. et al. (1986) Proc.
of the Nat. Acad. Sci. (USA) 83, 3136-3140). As with the other systems
described above, fragments of ADA can be generated based on the known
structure (see Table 1) using the design-evolution strategy described
above, with fragments fused to the GCN4 leucine zippers as test
oligomerization domains. The complementary fusions are cotransfected
and the proteins transiently expressed in COS-7 cells, or stably expressed
in CHO cells, grown in the selective medium containing
2'-deoxycoformycin. In the case of CHO cells, colonies are collected and
sequentially re-cultured at increasing concentrations of
2'-deoxycoformycin in order to enrich for populations of cells that
efficiently express the fusions at high concentrations.

6) Bleomycin binding protein (zeocin resistance gene) Zeocin, a
member of the bleomycin/phleomycin family of antibiotics, is toxic to
bacteria, fungi, plants, and mammalian cells. The expression of the
zeocin resistance gene confers resistance to bleomycin/zeocin. The
protein confers resistance by binding to and sequestering the drug,
thus preventing its association with and hydrolysis of DNA (Berdy, J.
(1980) In Amino Acid and Peptide Antibiotics, J. Berdy, ed. (Boca Raton,
FL: CRC Press), pp. 459-497; Mulsant, P., Tiraby, G., Kallerhoff, J., and
Perret, J. (1989) Somat. Cell. Mol. Genet. 14, 243-252). Bleomycin binding
protein (BBP) could then be used for a PCA based on cell survival. As
with the other systems described above, fragments of BBP can be
generated based on the known structure (see Table 1) using the design-
evolution strategy described above, with fragments fused to the GCN4
leucine zippers as test oligomerization domains. The BBP is a small (8
kD) dimer that binds to drugs via a subunit interface binding site. For this
reason, the design would be somewhat different in that, first, a single
chain form of the dimer would be generated by making a fusion of two
BBP genes with a short sequence coding for a simple polypeptide linker
introduced between the two subunits. One fragment in this case will be
based on a short sequence of one of the subunit modules, while the other
fragment will be composed of the remaining sequence of that subunit plus
the other subunit. Complementation and selection experiments will be
performed as described for the examples above, using bleomycin or
zeocin as selective drugs.

7) Hygromycin-B-phosphotransferase The antibiotic hygromycin-B is
an aminocyclitol that inhibits protein synthesis by disrupting translocation
and promoting misreading. The E. coli enzyme hygromycin-B-
phosphotransferase detoxifies the cells by phosphorylating hygromycin-
B. When expressed in mammalian cells, hygromycin-B-
phosphotransferase can confer resistance to hygromycin-B (Gritz, L.
and Davies, J. (1983) Gene 25, 179-188). The enzyme is a dominant
selectable marker and could be used for a PCA based on cell survival.
While the structure of the enzyme is not known, it is suspected that this
enzyme is homologous to aminoglycoside kinase (Shaw, et al. (1993)
Microbiol. Rev. 57, 138-163). It is therefore possible to use the combined
design/evolution strategy to produce a PCA with this enzyme and perform
dominant selection in mammalian cells with selection at increasing
concentrations of hygromycin B.

8) L-histidinol:NAD+ oxidoreductase The hisD gene of Salmonella
typhimurium codes for the L-histidinol:NAD+ oxidoreductase that
converts histidinol to histidine. Mammalian cells grown in media lacking
histidine but containing histidinol can be selected for expression of hisD
(Hartman, S. C. and Mulligan, R. C. (1988) Proc. of the Nat. Acad. Sci.
(USA) 85, 8047-8051). An additional advantage of using hisD in dominant
selection is that histidinol is itself toxic, inhibiting the activity of
endogenous histidyl-tRNA synthetase. Histidinol is also inexpensive and
readily permeates cells. The structure of histidinol:NAD+ oxidoreductase
is unknown, and so development of a PCA based on this enzyme relies
entirely on the exonuclease fragment/evolution strategy.

The following Table lists alternative embodiments using other PCA
reporters. Abbreviations in Table: Type: D, dominant selection marker; R,
recessive selection marker. Structure: four-letter codes = Protein Data
Bank (PDB) entries; K, known but not deposited in PDB; U, unknown.
Mono/oligo: M, monomer; D, dimer; tetra, tetramer.

TABLE 1. A List of Other Potential PCA Reporter Candidates

A - Assays based on Dominant or Recessive Selection

Enzyme:
  DHFR
  Adenosine deaminase
  Thymidine kinase
  Mutant hypoxanthine-guanine phosphoribosyl transferase
  Thymidylate synthetase
  Xanthine-guanine phosphoribosyl transferase
  Glutamine synthetase
  Asparagine synthetase
  Puromycin N-acetyltransferase
  Aminoglycoside phosphotransferase
  Hygromycin B phosphotransferase
  L-histidinol:NAD+ oxidoreductase
  Bleomycin binding protein
  Cytosine methyl-transferase
  O6-alkylguanine alkyltransferase
  Glycinamide ribonucleotide transformylase
  Glycinamide ribonucleotide synthetase

Type: D/R

Structure: many; 1ADD; 1K1N; 1HGM; 1NJE; 1NUL; 1ADN; 1GRC; U

Size: 35 kD

Mono/oligo: M; M; M; D

Selection drugs/conditions:
  methotrexate/trimethoprim
  Xyl-A or adenosine, alanosine, and 2'-deoxycoformycin
  gancyclovir, HAT
  HAT + thymidine kinase
  5-fluorodeoxyuridine
  mycophenolic acid with limiting xanthine
  β-aspartyl hydroxamate or albizziin
  puromycin
  neomycin, G418, gentamycin
  hygromycin B
  histidinol
  bleomycin/zeocin
  5-azacytidine (5-aza-CR) and 5-aza-2'-deoxycytidine
  nitrosourea
  dideazatetrahydrofolate, minus purine
  minus purine

Enzyme | Type | Structure | Size | Mono/oligo | Selection drugs/conditions
Phosphoribosyl-aminoimidazole synthetase | R | U | kD | | minus purine
Formylglycinamide ribotide amidotransferase | R | | 141.4 kD | M | azaserine, 6-diazo-5-oxo-L-norleucine, minus purine
Phosphoribosyl-aminoimidazole carboxylase | R | U | 39.5 kD | D | minus purine
Phosphoribosyl-aminoimidazole carboxamide formyltransferase | R | U | 57.3 kD | | minus purine
Fatty acid synthase | R | | 272 kD | D | cerulenin
IMP dehydrogenase | R | 1AK5 | 55.4 kD | Tetra | mycophenolic acid



Viral Plaque Assays 

Columns: Enzyme; Type; Structure; Size; Mono/oligo; Selection 
drugs/Conditions 

Enzyme | Type | Structure 
Thioredoxin | D | 1TDF 
Reverse transcriptase | D | 3HVT 
Viral protease | D | - 

Size: 3T5kb 

Mono/oligo: D 

B-Cell Death Assays 

Columns: Enzyme; Type; Structure; Size; Mono/oligo; Selection 
drugs/Conditions 

Enzyme: Cysteine protease: papain; Cysteine protease: caspase; 
Metalloprotease: carboxypeptidase; Serine protease: proteinase K; 
Aspartic protease: pepsin; Lysozyme; Phospholipase A2; Phospholipase C 

Type: D; D 

Structure: 1STF; 1CP3; 1PTK; many; 1P2P; 1AH7 

Size: 47.1 kD; 30.6 kD; 23.2 kD; 13.8 kD; 28 kD 

Mono/oligo: Hetero D; M; M/D; M 

Selection drugs/Conditions: inhibited by cystatin; inhibited by 
DEVD-aldehyde (can also be used in a fluorimetric or colorimetric assay, 
in vitro); inhibited by methyl-ethyl succinic acid; inhibited by serpins; 
inhibited by pepstatin A (can also be used in a fluorimetric assay, in 
vitro); inhibited by N-acetylglucosamine trisaccharide; inhibited by 
RNase inhibitor; inhibited by actin; many inhibitors: bromophenacyl 
bromide, hexadecyl-trifluoroethyl-glycero-phosphomethanol, bromoenol 
lactone, etc.; many inhibitors: neomycin, chelerythrine, U73122, etc. 

C-Colorimetric/Fluorimetric Assay 

Columns: Enzyme; Structure; Size; Mono/oligo; Selection drugs/Conditions 

Enzyme: DT-Diaphorase (NAD(P)H-[quinone acceptor] oxidoreductase); 
(NAD(P)H-(quinone acceptor) oxidoreductase)-2; Thermophilic diaphorase 
(Bacillus stearothermophilus); Glutathione-S-transferase; Luciferase; 
Green-fluorescent protein; Chloramphenicol acetyltransferase; Uricase; 
SEAP (secreted form of human placental alkaline phosphatase); 
β-Glucuronidase 

Structure: 1QRD; isoform of 1QRD; 1GNE; 1LCI; 1EMA; 1CLA; 1AJA; 1BHG 

Size: 26 kD; 21 kD; 30 kD; 26 kD, other isoform of 26 kD; 62 kD; 30 kD; 
25 kD; 32 kD; 71 kD 

Mono/oligo: M; M; Tri; Tetra; M; Tetra 

Selection drugs/Conditions: NADPH-diaphorase stain, inhibited by 
dicumarol, Cibacron blue and phenindone (note: can also be used in a cell 
death assay, + nitrobenzimidazole, for example); NRH-diaphorase stain, 
inhibited by pentahydroxyflavone; NADH-diaphorase stain; production of a 
yellow product by the conjugation of glutathione with an aromatic 
substance, chloro-dinitrobenzene (CDNB); fluorometric; intrinsic 
fluorescence; fluorimetric: Bodipy chloramphenicol; fluorometric; CSPD 
chemiluminescent substrate; histochemical, fluorometric or 
spectrophotometric assays using various substrates such as X-GLUC 

D-Heteromeric Enzyme Strategies 

Enzyme | Size | Mono/oligo | Selection drugs/Conditions 
Tyrosinase | 30 kD + 14 kD | Hetero, M+M | Colorimetric: synthesis of melanin 

EXAMPLE 4 
Variants of PCA to detect multiprotein/protein-RNA/protein-DNA complexes 

To this point, specific examples have been given only of 
applications of PCA to protein-pair interactions. However, it is possible 
to apply PCA to multiprotein, protein-RNA, protein-DNA or protein-small 
molecule interactions. There are two general schemes for achieving such 
systems. 

Multi-subunit PCA: Two proteins need not interact directly for a PCA 
signal to be observed; if a partner protein or protein complex binds to two 
proteins simultaneously, it is possible to detect such a three-protein 
complex. A multi-subunit PCA is conceived with the example of herpes 
simplex virus thymidine kinase (TK), a homodimer of 40 kD. In this 
conception, the TK structure contains two well-defined domains consisting 
of an alpha/beta domain (residues 1-223) and an alpha-helical domain 
(224-374). As a test system, we use the Rop1 dimer, a four-helix bundle 
homodimer. The two fragments of TK are extracted by PCR and subcloned 
into the transient transfection vector pMT3, the first in tandem to the 
Adenovirus major late promoter, tripartite leader 3' to the first ATG, and 
the second downstream of an EMCV internal initiation site. Restriction 
sites previously introduced between the first and the last ATG are 
subcloned into BamHI/KpnI and PstI/EcoRI cloning sites downstream of the 
two ATGs. These are used to subclone PCR-generated fragments of the Rop1 
subunits into two different vectors. Subsequently, Ltk- cells are 
cotransfected by lipofection with the two plasmids, and colonies of 
surviving cells are serially selected in medium containing increasing 
concentrations of HAT (hypoxanthine/aminopterin/thymidine). Cells that 
express complementary fragments of TK fused to Rop1 subunits will 
proliferate under this selective pressure; otherwise they die. Specific 
examples of use of this concept would be in determining constituents of 
multiprotein complexes that are formed transiently or constitutively in 
cells. 

The utility of PCA is not limited to detecting protein- 
protein interactions, but can be adapted to detecting interactions of 
proteins with DNA, RNA, or small molecules. In this conception, two 
proteins are fused to PCA complementary fragments, but the two proteins 
do not interact with each other. The interaction must be triggered by a 
third entity, which can be any molecule that will simultaneously bind to the 
two proteins or induce an interaction between the two proteins by causing 
a conformational change in one or both of the partners. Two examples 
have been demonstrated in our lab using the mDHFR PCA in E. coli. In 
the first case a natural product, the immunosuppressant drug rapamycin, 
is used to induce an interaction between its receptor FKBP12 and a 
partner protein mTOR (mammalian Target of Rapamycin). We detect 
this by cotransformation of DHFR fragments fused to FKBP or mTOR into 
E. coli grown in the presence or absence of trimethoprim (as described 
above) and rapamycin (0-10 nM). We have demonstrated that support 
of growth, as detected by colony formation, is completely dependent on the 
addition of rapamycin, suggesting that the mDHFR PCA is detecting a 
rapamycin-induced assembly of an FKBP12-mTOR complex and subsequent 
reconstitution of DHFR activity. This is one example of a use of the PCA 
strategy to test for small molecules that can induce interactions between 
proteins. General applications could be made to therapeutic 
development, in the form of screening small-molecule combinatorial 
compound libraries for molecules that induce interactions between 
proteins, that may inhibit the activities of either or both of the proteins, or 
that activate specific cellular processes that are initiated by other events, 
such as growth factor-mediated receptor dimerization. The discovery of such 
small molecules could lead to the development of orally available drugs 
for the treatment of a broad spectrum of human diseases. 

Another example of an induced interaction we have 
studied with the DHFR PCA is the interaction of the oncogene GTPase 
p21 ras and its direct downstream target, the serine/threonine kinase raf. 
This interaction only occurs when the GTPase is in the GTP-bound form, 
whereas turnover of GTP to GDP leads to release of the complex. As 
with the FKBP-mTOR complex, we have demonstrated this induced 
interaction in E. coli. PCA could be used in a general way to study such 
induced interactions, and to screen for compounds that release or 
prevent these interactions in pathological states. The ras-raf interaction 
itself could be a target of therapeutic intervention. Oncogenic forms of 
ras consist of mutants that are incapable of turning over GTP and 
therefore remain continuously associated with activated raf. This leads 
to a constitutive, uncontrolled growth signal that results, in part, in 
oncogenesis. The identification of compounds that inhibit this process, 
by PCA, would be of value in broad treatment of cancers. 

Other examples of multimolecular applications of PCA could include 
identification of novel DNA or RNA binding proteins. In its simplest 
conception one uses a known DNA or RNA binding motif, for instance 
a retinoic acid receptor zinc finger, or a simple RNA binding protein such 
as IF-1, respectively. One half of the PCA consists of the DNA or RNA 
binding domain fused to one of the PCA fragments (control 
fragment). The complementary fragment is fused to a cDNA library. A 
third entity is a gene coding for a sequence containing an element known 
to bind to the control protein, followed by a second putative or known 
regulatory element coded for after this sequence. A test system 
consists of tat/tar elements that control elongation in 
transcription/translation of HIV genes. An example application would be 
identification of tat binding elements that have been proposed to exist in 
eukaryotic genomes and may regulate genes in the same or a similar way 
to that of HIV genes (SenGupta, D. J. et al. (1996) Proc. Natl. Acad. Sci. 
USA 93, 8496-8501). 



EXAMPLE 5 

Examples of PCA applications to drug screening 

1) Drug screening 

Screening combinatorial libraries of compounds for 
those that inhibit or induce protein-protein/protein-RNA/protein-DNA 
complexes. The PCA strategy can be directly applied to identifying 
potentially therapeutic molecules contained in combinatorial libraries of 
organic molecules. It is possible to perform high throughput screening of 
such libraries to screen for compounds that will inhibit or induce protein- 
protein interactions or protein-DNA/RNA interactions (as discussed 
above). In addition, it is also possible to screen for compounds that inhibit 
enzymes whose substrates are other proteins, DNA, RNA or 
carbohydrates, as discussed below. In this application, proteins that 
interact, enzyme/protein substrate pairs, or control DNA/RNA binding 
protein-enzyme pairs are fused to PCA complementary fragments, and 
plasmids harboring these pairs are transformed/transfected into a cell, 
along with any third DNA or RNA element as the case requires. 
Transformed/transfected cells are grown in liquid culture in multiwell 
plates where each well is inoculated with a single compound from an array 
of combinatorially synthesized compounds. The readout of a response 
depends on the effect of a compound. If the compound inhibits a protein 
interaction, the response is negative (the absence of a PCA signal is the 
positive result). If the compound induces a protein interaction, the 
response is a positive PCA signal. Controls for non-specific effects of 
compounds include: 1) demonstration that the compound does not affect the 
PCA enzyme itself (test against cells transfected with the wild-type intact 
enzyme used as the PCA probe) and 2) in the case of a cell survival assay, 
demonstration that the compound is not toxic to cells that have not been 
transformed/transfected. As well as providing a high throughput assay for 
biological activity of compounds, PCA also offers the advantage over in 
vitro assays that it is a test for cell membrane permeability of active 
compounds. 

Specific demonstrated examples of PCA for drug screening 
in our laboratory include the application of DHFR PCA in E. coli to 
detecting compounds that inhibit therapeutically relevant targets. These 
include Bax/Bcl2, FKBP12/TOR, ras/raf, the carboxyl terminal dimerization 
domain of HIV-1 capsid protein, and the IkB kinase IKK-1 and IKK-2 
dimerization domains (leucine zippers and helix-loop-helix domains). In 
each case, the two proteins are subcloned 5' upstream of either F[1,2] or 
F[3] as described above. Plasmids harboring the complementary fragments 
are cotransformed into BL21 cells. Colonies from minimal medium plates 
containing IPTG and trimethoprim are picked and grown in liquid medium 
under the same selective conditions, and frozen stocks are made. For a 
single screening cycle, a priming overnight culture is grown from frozen 
stocks in LB medium. A selective minimal medium containing 
trimethoprim, ampicillin and IPTG is aliquoted at 25 µl into each well of a 
384-well plate. Each well is then inoculated with 1 µl of an individual 
sample from a compound array (ArQule Inc.) to give a final concentration 
of 10 µM. Each well is then inoculated with 2 µl of overnight culture, and 
plates are incubated in a specially adapted shaker bath at 37 °C. At 2 hour 
intervals for a period of 8 hours, plates are read at 600 nm (scattering) on 
an optical absorption spectroscopic plate reader coupled to a PC and 
spreadsheet software. Rates of growth are calculated from individual 
time readings for each well and compared to a standard curve. A "hit" is 
defined as a case where an individual compound reduces the rate of 
growth to below the lower bound of the 95% confidence interval based on 
the standard deviation for growth rates observed in all of the wells within 
the test plate. "Near hits" are defined as those cases where growth rates 
are within the 95% confidence interval. For each of the hits or near hits, 
the following controls are then performed: the same experiment is 
performed with BL21 cells that are transformed with empty vector (and no 
trimethoprim), with vector harboring the full length mDHFR gene, or with 
cotransformed cells where protein expression is not induced by IPTG. If in 
all of these cases the compound has no effect, it can be concluded that it 
is specifically disrupting the protein-protein interaction being tested. Such 
validated hits or near hits are then retested to establish a dose-response 
curve for the individual compound, with concentrations varying from 1 pM 
up to 1 mM in tenfold steps. The PCA strategy for 
compound screening can also be applied in the multiprotein and protein- 
RNA/DNA cases as described above, and can easily be adapted to the 
DHFR or any other PCA in E. coli or in yeast versions of the same PCAs. 
Such screening can also be applied to enzymes whose targets are other 
proteins or nucleic acids, for known enzyme/substrate pairs or for novel 
enzyme/substrate pairs identified as described below. 
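
The plate-reading and hit-calling steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: growth is taken to be exponential over the reading window, the growth rate is the least-squares slope of ln(OD600) against time, and the lower bound of the interval is approximated as the plate mean minus 1.96 standard deviations. The function names, well labels and the 1.96 multiplier are hypothetical choices, not part of the procedure described in the text.

```python
import math
import statistics


def growth_rate(times_h, od600):
    """Least-squares slope of ln(OD600) versus time, in 1/hour (assumes exponential growth)."""
    logs = [math.log(v) for v in od600]
    t_mean = statistics.fmean(times_h)
    y_mean = statistics.fmean(logs)
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    den = sum((t - t_mean) ** 2 for t in times_h)
    return num / den


def call_hits(rates_by_well, z=1.96):
    """Return (lower_bound, wells whose growth rate falls below it).

    The bound is mean - z * SD over all wells of the plate; z = 1.96 is an
    assumed stand-in for the 95% interval mentioned in the text.
    """
    values = list(rates_by_well.values())
    lower = statistics.fmean(values) - z * statistics.stdev(values)
    hits = sorted(well for well, rate in rates_by_well.items() if rate < lower)
    return lower, hits


def dose_series(low_molar=1e-12, high_molar=1e-3):
    """Tenfold concentration steps from low_molar up to high_molar (1 pM to 1 mM by default)."""
    steps = round(math.log10(high_molar / low_molar))
    return [low_molar * 10 ** i for i in range(steps + 1)]
```

With readings at 0, 2, 4, 6 and 8 hours, `growth_rate` gives one rate per well, `call_hits` flags wells for the follow-up controls, and `dose_series` lists the concentrations for the dose-response retest. A single strongly inhibited well inflates the plate standard deviation, so a bound of this kind is only meaningful when most wells grow normally.
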
Proteins involved in viral integration processes are examples of targets 
that could be tested for inhibitors using the PCA strategies. Examples for 
HIV include: 

i) Inhibition of integrase or the transport of the pre-integration complex: 
protein MA or Vpr. 

ii) Inhibition of the cell cycle in G2 by Vpr (interaction with cyclin B) 
causing induction of apoptosis. 

iii) Inhibition of the interaction of gp160 (precursor of the membrane 
proteins) with furin. 

Accessory proteins of HIV as a therapeutic target: 

i) Vpr: nuclear localizing sequence (target); interaction site of Vpr with 
phosphatase A. 

ii) Vif: interaction with vimentin (cytoskeleton-associated protein). 

iii) Vpu: degradation of CD4 in the ER mediated by the cytoplasmic tail of 
Vpu. 

iv) Nef: myristoylation signal of Nef. 

Other general targets for drug screening could include 
proteins linked to neurodegenerative diseases, such as alpha-synuclein. 
This protein has been linked to early onset of Parkinson disease and is 
also implicated in Alzheimer disease. There are also the beta-amyloid 
proteins, linked to Alzheimer disease. 

An example of protein-carbohydrate interactions that 
would be a target for drug screening is that of the selectins, which are 
generally implicated in inflammation. These cell surface glycoproteins are 
directly involved in diapedesis. 

A number of tumor suppressor genes whose actions are 
mediated by protein-protein interactions could be screened for potential 
anti-cancer compounds. These include PTEN, a tumor suppressor directly 
involved in the formation of hamartomas. It is also involved in inherited 
breast and thyroid cancer. Other interesting tumor suppressor genes 
include p53, Rb and BRCA1. 



EXAMPLE 6 

Examples of applications of the PCA strategy to detect 



The examples described above are used for identifying 
novel molecular interactions involving molecules that merely bind to each 
other. However, detecting the substrates of enzymes is also fully 
compatible with the PCA strategy, using either: 

i) Enzymes that form tight complexes with protein substrates and induce 
efficient PCA fragment assembly, or 

ii) Mutant enzymes that bind tightly to substrate but do not undergo 
product release because of mutations of residues involved in nucleophilic 
attack and/or product release (substrate trapping). 

Enzymes may form tight complexes with their substrates 
(Kd ~ 1-10 µM). In these cases PCA may be efficient enough to detect 
such interactions. However, even if this is not true, PCA may work to 
detect weaker interactions. Generally, if the rate of catalysis and product 
release is slower than the rate of folding/reassembly of the PCA 
complementary fragments, effectively irreversible folding and 
reconstitution of the PCA reporter activity will have occurred. Therefore, 
even if the enzyme and substrate are no longer interacting, the PCA 
signal is detected. The detection of novel enzyme substrates 
using PCA may therefore be possible, independent of effective substrate 
Kd or rate of product release. In cases where product release is much 
faster than PCA fragment assembly/folding, an alternative approach is 
provided by generating "substrate trapping" mutants of the test enzyme. An 
example of this approach is its application to the protein tyrosine 
phosphatase PTP1B, where substrate trapping mutants have been generated 
by mutating the nucleophilic aspartate 181 to alanine, rendering the 
enzyme catalytically dead but capable of forming tight complexes with a 
known substrate, the EGF receptor, and other unknown proteins (Flint, A. J. 
et al. (1996) Proc. Natl. Acad. Sci. USA 94, 1680-1685). An application of 
PCA to screen for interacting partners of PTP1B is given as follows. We use 
the aminoglycoside kinase (AK)-based PCA in transiently transfected COS 
or 293 cells. The substrate trap mutant catalytic domain of PTP1B is 
fused to the N-terminal complementary fragment of AK, while a C-terminal 
fusion of the other AK fragment is made to a cDNA library. Cells are co- 
transfected with complementary AK pairs and grown in selective 
concentrations of G418. After 72 hours, colonies of surviving cells are 
picked and in situ PCR is performed using primers designed to anneal to 
the 3' and 5' flanking regions of the cDNA coding region. PCR-amplified 
products are then 5' sequenced to identify the gene. 
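
The kinetic argument above (a signal is retained when fragment folding outruns product release) can be made concrete with a small sketch. It assumes, purely for illustration, that folding/reassembly and product release are two competing first-order processes starting from the enzyme-substrate complex; the rate constants `k_fold` and `k_release` are hypothetical parameters, not values given in the text.

```python
def fraction_trapped(k_fold, k_release):
    """Probability that fragment folding occurs before product release.

    Both processes are modelled as competing first-order steps, so the
    partition ratio is k_fold / (k_fold + k_release). Rate constants must
    share the same units (for example 1/s).
    """
    if k_fold < 0 or k_release < 0 or (k_fold + k_release) == 0:
        raise ValueError("rate constants must be non-negative and not both zero")
    return k_fold / (k_fold + k_release)


# Slow product release (or a substrate-trapping mutant): most complexes
# fold into an active reporter before the substrate leaves.
slow_release = fraction_trapped(k_fold=1.0, k_release=0.01)

# Fast product release: very few complexes last long enough to fold.
fast_release = fraction_trapped(k_fold=1.0, k_release=100.0)
```

Under this simplified picture a substrate-trapping mutation acts by lowering `k_release`, which moves the partition toward reconstituted reporter without any change in the folding rate.
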

Enzyme inhibitors: Screening combinatorial libraries of compounds for 
those that inhibit enzyme-protein substrate complexes, either with: 

i) Enzymes that form tight complexes with protein substrates, or 

ii) Mutant enzymes that bind tightly to substrate but do not undergo 
product release because of the mutation. 

EXAMPLE 7 

Applications of the PCA strategy to protein engineering/evolution 

The PCA strategy can be used to generate peptides or 
proteins with novel binding properties that may have therapeutic value, 
as is done today with phage display technology. It is also possible to 
develop enzymes with novel substrate or physical properties for industrial 
enzyme development. Two detailed examples of the application of the 
PCA strategy to these ends are given below, with additional applications 
listed below. 

1) Selection of high-affinity, heterodimerizing leucine zipper 
sequences (J. Pelletier, K. Arndt, A. Plueckthun and S. Michnick, 
manuscript in preparation). The mDHFR PCA, described above, was 
used in a scheme for the selection of efficiently heterodimerizing, 
designed leucine zippers. It has been proposed that the formation of salt 
bridges between positively and negatively charged residues at 
complementing "e" and "g" positions is important in stabilizing leucine 
zipper formation, though this view has been contested. In order to help 
define the importance of salt-bridge formation at the e and g positions, 
two leucine zipper libraries were built. Both are based on the GCN4 
leucine zipper sequence, but contain sequence information specific to 
either Jun or Fos zippers in order to create heterodimerizing pairs. As 
well, the e-1 to e-4 and g-1 to g-4 positions in each library were 
randomized to code for positively or negatively charged residues, or 
neutral polar residues. These libraries were amplified by PCR and 
subcloned into the Z-F[1,2] or Z-F[3] constructs (described above) from 
which the GCN4 zipper sequences had been removed. The bacterial 
mDHFR PCA selection was performed on selective solid media, as 
described earlier. Colonies were picked and sequenced; sequence 
analysis reveals that the distribution of charged or neutral residues at e-g 
pairs is not random, but is biased toward pairing of opposite charges, or 
pairing of a charged with a neutral residue, rather than same-charge 
pairing (see figure 7). We reasoned that better zipper pairing should lead 
to an increase in efficiency of DHFR-fragment complementation, resulting 
in faster bacterial doubling times (see Table 1 in the mDHFR PCA 
description), and undertook a selection/enrichment of the novel zippers 
relative to GCN4, as follows. The designed zipper libraries, expressed as 
N-terminal fusions to the DHFR F[1,2] or F[3:I114A], were cotransformed, 
clones were picked, propagated and mixed in selective liquid culture, and 
the mix was added in a 1:1,000,000 ratio to clone Z-F[1,2] + Z-F[3:I114A] 
(original GCN4 leucine zippers). The mixture was propagated in selective 
liquid culture over multiple passages. Restriction analysis shows that 
within 4 passages, the population of GCN4-expressing bacteria is 
diminishing relative to the novel zipper sequences (data not shown), 
indicating that some of the designed zipper-containing clones are 
propagated at a higher rate than those containing GCN4. Bacteria from 
later passages were plated on selective medium, and individual clones 
were sequenced to reveal the identity of the most successful designed 
zipper pairs (data not shown). 
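
The enrichment experiment rests on a simple piece of arithmetic: a clone that doubles faster overtakes a reference clone even from a 1:1,000,000 starting ratio. The sketch below illustrates this, assuming unrestricted exponential growth of both populations and no other interactions between them. The doubling times are hypothetical inputs, not measurements from the text.

```python
import math


def novel_fraction(hours, td_novel_h, td_ref_h, start_ratio=1e-6):
    """Fraction of the culture made up of the novel clone after `hours`.

    start_ratio is novel:reference at time zero (1:1,000,000 by default);
    td_novel_h and td_ref_h are doubling times in hours. The ratio is
    tracked in log2 space to avoid very large intermediate numbers.
    """
    log2_ratio = math.log2(start_ratio) + hours * (1.0 / td_novel_h - 1.0 / td_ref_h)
    return 1.0 / (1.0 + 2.0 ** (-log2_ratio))


def hours_to_parity(td_novel_h, td_ref_h, start_ratio=1e-6):
    """Time at which the two populations are equal in number."""
    advantage = 1.0 / td_novel_h - 1.0 / td_ref_h
    if advantage <= 0:
        raise ValueError("the novel clone must double faster than the reference")
    return -math.log2(start_ratio) / advantage
```

For example, `hours_to_parity(1.0, 2.0)` gives the time needed for a clone doubling every hour to catch up with a reference doubling every two hours; smaller growth advantages lengthen this time in inverse proportion, which is why several passages are needed before the shift becomes visible.
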

2) Application of PCA to enzyme function and design 

PCA Development: Adenosine deaminase (ADA) meets all of the criteria for 
a PCA listed above. ADA is a small (~40 kD), easily purified 
monomeric zinc metallo-enzyme, and the structure of murine ADA has 
been resolved. Several in vitro ADA activity assays have been 
developed, involving UV spectrophotometry and stopped-flow fluorimetry. 
E. coli ADA catalyzes the irreversible conversion of cytotoxic adenine 
nucleosides to non-toxic inosines. 

Eukaryotic or prokaryotic cells propagated in the 
presence of cytotoxic concentrations of adenosine or adenosine analogs 
require ADA to detoxify these compounds. This is the basis of a 
dominant-selection strategy used to select for cells expressing a specific 
gene in mammalian cells. The ADA gene has also been expressed in 
SF3834 E. coli cells, which lack a gene coding for endogenous ADA. 
When the gene coding for ADA is introduced into ADA- bacterial DNA, 
those cells that express ADA are able to survive high concentrations of 
added adenosine; those that do not, die. This forms the basis of an in 
vivo ADA activity assay. 

We chose ADA principally because it can be used as 
a dominant selective marker in mammalian cells and in bacterial cells where 
the gene has been knocked out. The reason we choose dominant selective 
genes is that, in screening for novel protein-protein interactions, 
particularly testing for interactions of a known protein against a library of 
millions of independent clones, selection serves to filter out cells that may 
show a positive response for reasons having nothing to do with a specific 
protein-protein interaction. We will use three test systems of interacting 
proteins, including leucine zipper-forming sequences, the proteins raf and 
p21, and the induced oligomerization system of FK506 binding protein 
(FKBP) and mTOR, which interact through the macrocyclic immuno- 
suppressant compound rapamycin. For all of these systems, we will 
construct E. coli and mammalian transient transfection plasmids and 
subclone the test proteins as fusions to ADA complementary fragments. 
The primary assay will be survival of SF3834 E. coli cells that have been 
transformed with the complementary ADA fragments fused to the test 
oligomerization proteins in the presence of toxic concentrations of 
adenosine. We will then purify fusion proteins from colonies and 
perform in vitro assays of ADA activity as described below. The utility of 
the ADA PCA as a method to identify novel proteins that interact with a 
test bait will be tested in mammalian COS-7 and HEK-293T cells 
transiently transfected with FKBP fused to one of the ADA fragments and 
the other fragment fused to a cDNA library from normal human spleen 
containing 10^6 independent clones. As with the E. coli assay, cells that 
survive in a medium containing toxic concentrations of adenosine are 
collected, and isolated plasmids will be tested to identify the gene for the 
interacting protein by PCR amplification and chain propagation-termination 
techniques. 

Structural motifs required for protein function: The structural elements 
required for the enzymatic function of ADA are investigated through 
alteration of the structures of the enzyme fragments. 
At first, ADA is cut into two separate domains, one responsible for 
substrate binding (residues 1-210) and one responsible for catalysis 
(residues 211-352). These separate pieces will be attached to known 
assembly domains, such as leucine zippers (see example 1 above). 
Reassembly will restore activity, which will be assessed through detailed 
in vitro kinetic analysis of the binding and catalytic properties of the re- 
assembled enzyme, using UV spectrophotometry and stopped-flow 
fluorimetry to observe the enzymatic reactions. This system will provide 
another handle on the manipulation of enzyme activity that will afford a 
powerful tool for enzymatic mechanism study. For example, the 
difference in the kinetic behaviour of the reassembled enzyme on mixing 
with the substrate, compared to enzyme reassembled in the presence of 
substrate (where substrate may already be bound by the binding domain), 
will allow a sophisticated level of study of the importance of binding energy 
to catalysis. Subsequent point mutations to the functional or assembly 
domains of the proteins will then allow a very subtle perturbation and 
detailed quantification of the relationship of binding energy to catalysis. 
This precise control over the structure and assembly of separate 
functional domains of the enzyme will permit very sophisticated enzymatic 
structure-function studies, the definition of structural motifs and an 
understanding of their role in catalysis. 

Novel protein catalyst design: The detailed knowledge of the enzyme 
mechanism gained through determination of the structural requirements 
for catalysis will then be exploited through the combination of these 
functional "building blocks" with the functional motifs responsible for 
substrate binding and catalysis in other enzymes, allowing the generation 
of novel protein catalysts. For example, the catalytic motif from ADA is 
modified to a cytidine-binding motif, creating a novel enzyme with 
potentially useful catalytic properties. The activity of these novel 
enzymes can easily be assessed through in vivo assays similar to that 
of the PCA system, or through in vitro activity assays. Furthermore, the 
detailed mechanistic investigation of the resulting enzymes possible with 
this system will permit the rational design of each subsequent generation 
of catalysts. 

EXAMPLE 8 

Examples of applications of the PCA strategy to detect molecular 
interactions in whole organisms 

It is a logical extension of the descriptions of PCA 
applications above to the utility of these techniques in whole model 
organisms such as drosophila, nematodes, zebra fish and puffer fish, as 
examples. The sole differences from the other listed examples are that the 
vectors used would need to be different (for example, retroviral vectors) 
and that any substrates needed by the PCA would need to be bioavailable, 
or detection would need to be performed in situ. 

EXAMPLE 9 

Examples of applications of the PCA strategy to gene therapy 

Another important embodiment of the invention is to 
provide a means and method for gene therapy of mammalian disease. Of 
particular interest is the use of a PCA therapeutic for treatment of cancer. 
In one embodiment of said PCA gene therapy, a PCA is developed 
employing fragments (modular protein units) derived from a protein toxin, 
for example Pseudomonas exotoxin, Diphtheria toxin and the plant toxin 
gelonin, or other like molecules. For therapy of breast cancer, for example, 
first a mammalian, retroviral, adenoviral, or eukaryotic artificial 
chromosomal (EAC) genetic construct is prepared that introduces one 
fragment of the selected toxin under the control of the promoter for 
expression of the erbB2 oncogene. It is well known that the erbB2 
oncogene is overexpressed in breast cancer and adenocarcinoma cells 
(D. J. Slamon et al., Science, 1989, 244, 707). The HER2/neu (c-erbB-2) 
proto-oncogene encodes a sub-class I 185-kDa transmembrane 
protein tyrosine kinase growth factor receptor, p185HER2. Also, the human 
erbB2 oncogene is located on chromosome 17, region q21, and 
comprises 4,480 base pairs, and p185HER2 serves as a receptor for a 30- 
kDa glycoprotein growth factor secreted by human breast cancer cell lines 
(R. Lupu et al., Science, 1990, 249, 1152). 

The transgene is introduced 'in vivo' or 'ex vivo' into
target cells employing methods known by those skilled in the art, e.g.
homologous recombination to insert the transgene into the locus of interest
via retroviral, adenoviral or EAC vectors. A second genetic construct is
prepared comprising a fusion gene containing a target DNA that encodes a
protein that interacts with the erbB2 oncogene, discovered by the PCA
process described in this invention, and the "second" fragment of the toxin
molecule. This construct is delivered to the patient by methods known in
the art, for example as shown in U.S. Patent Nos. 5,399,346 and
5,585,237, whose entire contents are incorporated by reference herein.
Transgene expression of the erbB2 oncogene-toxin fragment described
will now be under the control of the constitutive oncogene promoter.
Proliferating tumor cells will thus produce one piece of the toxin attached
as a fusion to the erbB2 oncogene. In the presence of the second
genetic construct, expressing the PCA-discovered erbB2 oncogene
"interacting protein - toxin fragment" construct, the complex erbB2
oncogene-toxin fragment A : interacting protein-toxin fragment B will be
created and induce death of target tumor cells through creation of an
active toxin through Protein Fragment Complementation, and thus provide
an efficacious and efficient therapy of said disease.

This can be extended to other diseases and other toxins 
employing techniques described and embodied in this invention. 



EXAMPLE 10

Examples of applications of the PCA strategy to detect molecular
interactions in vitro

Any of the PCA strategies described above could be
adapted to in vitro detection. Unlike the in vivo PCAs, however,
detection would be performed with purified PCA fragment-fusion proteins.
Such uses of PCA have the potential for use in diagnostic kits. For
example, the test DHFR assay described above, where the interacting
domains are FKBP12 and TOR, could be used as a diagnostic test for
rapamycin concentrations, for use in monitoring dosage in patients
treated with this drug.
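
As an illustration only of how a read-out from such a diagnostic test might be converted into a drug concentration, the following Python sketch assumes a simple one-site saturation relationship between rapamycin concentration and reconstituted reporter signal. The model, the function names and the parameters s_max and ec50_nm are hypothetical and are not taken from this disclosure; in practice a calibration curve would be measured with standards of known concentration.

```python
def signal_from_concentration(conc_nm, s_max, ec50_nm):
    """Hypothetical calibration model: the signal rises with rapamycin
    concentration and saturates at s_max (one-site saturation curve)."""
    return s_max * conc_nm / (ec50_nm + conc_nm)


def concentration_from_signal(signal, s_max, ec50_nm):
    """Invert the calibration model to estimate concentration from a
    measured signal. Only valid for 0 <= signal < s_max."""
    if not 0 <= signal < s_max:
        raise ValueError("signal outside the calibrated range")
    return ec50_nm * signal / (s_max - signal)


if __name__ == "__main__":
    # Assumed calibration parameters, for illustration only.
    s_max, ec50_nm = 100.0, 10.0
    reading = signal_from_concentration(5.0, s_max, ec50_nm)
    estimate = concentration_from_signal(reading, s_max, ec50_nm)
    print(f"estimated concentration: {estimate:.3f} nM")
```

A reading at or above the assumed plateau is rejected rather than extrapolated, since the inverse is undefined there.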

As shown above, the instant invention provides the ability to:

1) Detect protein-protein interactions in vivo or in vitro.

2) Detect protein-protein interactions in appropriate
contexts, such as within a specific organism, cell type, cellular
compartment, or organelle.

3) Detect induced versus constitutive protein-protein
interactions (such as by a cell growth or inhibitory factor).

4) Distinguish specific versus non-specific protein-protein
interactions by controlling the sensitivity of the assay.

5) Detect the kinetics of protein assembly in cells.

6) Screen cDNA libraries for protein-protein interactions.

Further aspects of the invention can be demonstrated
by identifying novel interactions with the enzyme p70S6k, to determine its
regulation and how separate signaling cascades converge on this
enzyme.

The PCA method is particularly useful for detection of
the kinetics of protein assembly in cells. The kinetics of protein assembly 
can be determined using fluorescent protein systems. 
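
By way of a hedged illustration, a fluorescence time course from such a system could be reduced to an apparent rate constant as sketched below in Python. The sketch assumes a pseudo-first-order model, signal(t) = plateau * (1 - exp(-k t)), with a known plateau; this model, the function name and the sample values are assumptions for illustration and are not part of this disclosure.

```python
import math


def apparent_rate_constant(times, signals, plateau):
    """Estimate k for signal(t) = plateau * (1 - exp(-k * t)).

    Linearises the model as -ln(1 - s/plateau) = k * t and returns the
    least-squares slope of a line constrained through the origin.
    """
    numerator = 0.0
    denominator = 0.0
    for t, s in zip(times, signals):
        if not 0 <= s < plateau:
            raise ValueError("each signal must lie in [0, plateau)")
        y = -math.log(1.0 - s / plateau)
        numerator += t * y
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("need at least one time point with t != 0")
    return numerator / denominator


if __name__ == "__main__":
    # Synthetic, noise-free data generated with an assumed k of 0.2 per minute.
    times = [1.0, 2.0, 5.0, 10.0]
    signals = [50.0 * (1.0 - math.exp(-0.2 * t)) for t in times]
    print(apparent_rate_constant(times, signals, plateau=50.0))
```

With noisy measurements, a nonlinear fit that also estimates the plateau would generally be preferable to this linearised form.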

In a further embodiment of the invention, PCA can be
used for drug screening. The techniques of PCA are used to screen for
drugs that block specific biochemical pathways in cells, allowing for a
carefully targeted and controlled method for identifying products that have
useful pharmacological properties.
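
A screen of this kind is commonly scored by comparing the PCA signal in each compound-treated well with untreated and background controls. The Python sketch below shows one such scoring scheme; the function names, the control definitions and the 50% threshold are illustrative assumptions and are not prescribed by this disclosure.

```python
def percent_inhibition(signal, positive_control, background):
    """Percent loss of PCA signal relative to the assay window.

    positive_control: signal with the interaction intact (no compound).
    background: signal with no reconstituted reporter activity.
    """
    window = positive_control - background
    if window <= 0:
        raise ValueError("positive control must exceed background")
    return 100.0 * (positive_control - signal) / window


def call_hits(well_signals, positive_control, background, threshold=50.0):
    """Return names of compounds whose inhibition meets the threshold."""
    return [
        name
        for name, signal in well_signals.items()
        if percent_inhibition(signal, positive_control, background) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical plate readings, for illustration only.
    wells = {"compound_A": 20.0, "compound_B": 90.0}
    print(call_hits(wells, positive_control=100.0, background=10.0))
```

The same scoring could be reversed (signal gain rather than loss) when screening for compounds that trigger an interaction.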

Although the present invention has been
described herein above by way of preferred embodiments thereof, it can
be modified, without departing from the spirit and nature of the subject
invention as defined in the appended claims.

WHAT IS CLAIMED IS:

1. Molecular fragment complementation assays for the
detection of molecular interactions comprising a reassembly of separate 
fragments of a molecule, wherein reassembly of said fragments is 
operated by the interaction of molecular domains fused to each fragment 
of said molecules, and wherein reassembly of the fragments is 
independent of other molecular processes. 

2. A method for detecting biomolecular interactions, said
method comprising:

(a) selecting an appropriate reporter molecule; 

(b) effecting fragmentation of said reporter molecule 
such that said fragmentation results in reversible loss of reporter function; 

(c) fusing or attaching fragments of said reporter 
molecule separately to other molecules; followed by 

(d) reassociation of said reporter fragments through 
interactions of the molecules that are fused to said fragments. 

3. The method of claim 2 wherein said reporter 
molecule is a multimeric protein. 



4. The method of claim 2 wherein said reporter
molecule is a multimeric enzyme.

5. The method of claim 2 wherein said reporter 
molecule is a multimeric receptor. 

6. The method of claim 2 wherein said reporter
molecule is a multimeric binding protein.

7. The method of claim 2 wherein said reporter
molecule is a catalytic molecule.

8. The method of claim 2 wherein said reporter
molecule is an energy transfer molecule.

9. The method of claim 2 wherein said reporter
molecule is a fluorescent or luminescent or phosphorescent protein.

10. The method of claim 2 wherein said detected
molecule is a nucleic acid or a ribozyme.

11. The method of claim 2 wherein said detected
molecule is a lipid or an oligosaccharide.

12. The method of claim 2 wherein said detected
molecule is a ligand.

13. The method of claim 2 wherein said detected
molecule is a nucleic acid.

14. The method of claim 2 wherein said detected
molecule is a peptide.

15. The method of claim 2 wherein said detected
molecule is a carbohydrate.

16. The method of claim 2 wherein said fragmentation 
is effected by a method selected from the group consisting of genetic 
manipulation, synthetic chemistry or de novo synthesis, photochemical or 
enzymatic cleavage, and proteolytic or hydrolytic chemistry. 

17. The method of claim 2 wherein said reassociation 
of the reporter molecule fragments is effected by molecules fused or 
attached to said fragments. 

18. A method of testing biomolecular interactions
comprising:

a) generating a first fusion product comprising
i) a first fragment of a first molecule and
ii) a second molecule which is different from or the
same as said first molecule;

b) generating a second fusion product comprising
i) a second fragment of said first molecule; and
ii) a third molecule which is different from or the
same as said first molecule or second molecule;

c) allowing the first and second fusion products to
contact each other; and

d) testing for activity regained by association of the
recombined fragments of the first molecule, wherein said reassociation is
mediated by interaction of the second and third molecules.

19. The method of claim 18 wherein at least one of said
second or said third molecules is a protein.

20. The method of claim 18 wherein at least one of said
second or said third molecules is an enzyme.

21. The method of claim 18 wherein at least one of said
second or said third molecules is a nucleic acid.

22. A method comprising an assay where fragments of
a first molecule are fused to a second molecule and fragment association
is detected by reconstitution of the first molecule's activity.

23. A composition comprising a product selected from
the group consisting of:

(a) a first fusion product comprising:
1) a first fragment of a first molecule whose
fragments can exhibit a detectable activity when associated and
2) a second molecule that can bind (a)(1);

(b) a second fusion product comprising
1) a second fragment of said first molecule and
2) a third molecule that can bind (b)(1); and

(c) both (a) and (b).

24. A composition comprising complementary
fragments of a first molecule, each fused to separate molecules.

25. The composition of Claim 24 wherein the first
molecule is selected from the group consisting of: multimeric protein,
multimeric receptor, multimeric binding protein, catalytic molecule, energy
transfer molecule, a fluorescent or luminescent or phosphorescent
protein.

26. The composition of Claim 24 wherein the first 
molecule is a multimeric enzyme. 

27. The composition of claim 24 wherein the second 
and third molecules can bind to each other. 

28. A composition comprising a nucleic acid molecule 
coding for a fusion product, which molecule comprises sequences coding 
for a product selected from the group consisting of: 

(a) a first fusion product comprising:

1) fragments of a first molecule whose fragments
can exhibit a detectable activity when associated and

2) a second molecule fused to the fragment of the
first molecule;

(b) a second fusion product comprising 

1) a second fragment of said first molecule and 

2) a second or third molecule; and 

(c) both (a) and (b). 






29. A host cell comprising a composition according to 
either Claim 24 or Claim 28. 

30. An assay which comprises using either the
composition according to Claim 24 or Claim 28 or using the host cell 
according to Claim 29. 

31. A method of testing for biomolecular interactions
associated with: (a) complementary fragments of a first molecule whose
fragments can exhibit a detectable activity when associated or (b) binding
of two protein-protein interacting domains from a second or third
molecule, said method comprising:

1) creating a fusion of
(a) a first fragment of a first molecule whose
fragments can exhibit a detectable activity when associated and
(b) a first protein-protein interacting domain;

2) creating a fusion of
(a) a second fragment of said first molecule and
(b) a second protein-protein interacting domain
that can bind said first protein-protein interacting domain;

3) allowing the fusions of (1) and (2) to contact each
other; and

4) testing for said activity.

32. The method according to any one of Claims 1, 2,
18, 22 or 31 wherein the activity of the fragments of said first molecule is
controlled by changing parts of the fragments to either increase or
decrease their ability to associate.

33. A composition comprising a product selected from
the group consisting of:

(a) a first fusion product comprising:
1) a first fragment of a molecule whose
fragments can exhibit a detectable activity when associated and
2) a first protein-protein interacting domain;

(b) a second fusion product comprising
1) a second fragment of said first molecule and
2) a second protein-protein interacting domain
that can bind said first protein-protein interacting domain; and

(c) both (a) and (b).



34. A composition comprising a nucleic acid molecule
coding for a fusion product, which molecule comprises sequences coding
for either:

(a) a first fusion product comprising:
1) a first fragment of a molecule whose
fragments can exhibit a detectable activity when associated and
2) a first protein-protein interacting domain; or

(b) a second fusion product comprising
1) a second fragment of said molecule and
2) a second protein-protein interacting domain
that can bind said first protein-protein interacting domain; or

(c) both (a) and (b).



35. A host cell comprising a composition according to 
either Claim 33 or Claim 34. 

36. An assay which comprises using either the
composition according to Claim 33 or Claim 34 or using the host cell
according to Claim 35.

37. A method of detecting kinetics of protein assembly
comprising performing PCA.

38. A method of detecting kinetics of protein assembly
comprising the method of any one of Claims 1, 2, 18, 22, or 31.

39. A method of screening a cDNA library comprising
performing PCA.

40. A method of screening a cDNA library comprising
the method of any one of Claims 1, 2, 18, 22, or 31.

41. A method according to any one of Claims 1, 2, 18,
22, 31, 39 or 40 wherein a chromogenic, fluorogenic, enzymatic, or other
optically detectable signal is generated.

42. A method according to any one of Claims 1, 2, 18, 
22, 31, 39 or 40 wherein dihydrofolate reductase is used as part of a 
signal-generating system. 






43. A method of determining the minimum length of at 
least one of two or more interacting domains comprising performing any 
of the assays of Claims 1, 2, 18, 22, 31, 39 or 40. 

44. A method of determining whether a molecular
complex comprises two or more interacting domains, said method
comprising performing any of the assays of Claims 1, 2, 18, 22, 31, 39 or
40.

45. A method according to any one of Claims 1, 2, 18,
22, 31, 39 or 40 wherein at least one other molecule is present which
causes an interaction of said second and third molecules.

46. A method according to any one of Claims 39 or 40 
where there is an interaction between parts of a first or a second 
molecule due to the presence of at least one other molecule which 
mediates the interaction. 

47. A method of testing the ability of a compound to 
inhibit molecular interactions in a PCA comprising performing a PCA in 
the presence of said compound and correlating any inhibition with said 
presence. 

48. A method for detecting protein-protein interactions 
in living organisms and/or cells, which method comprises:

(a) synthesizing probe protein fragments from an 
enzyme which enables dominant selection by dissecting the gene coding 
for the enzyme into at least two fragments; 

(b) constructing fusion proteins with one or more
molecules that are to be tested for interactions;

(c) fusing the proteins obtained in (b) with one or more
of the probe fragments;

(d) coexpressing the fusion proteins; and

(e) detecting the reconstitution of enzyme activity.

49. The method according to claim 48 wherein the 
enzyme is murine dihydrofolate reductase. 

50. The method according to claim 48 wherein the
fusion proteins have peptides consisting of N- and C-terminal fragments
of said murine dihydrofolate reductase and are fused to GCN4 leucine
zipper sequences.

51. The method according to claim 48 wherein
coexpression of the complementary fusion proteins is catalyzed by the
binding of the test proteins to each other.

52. A method for screening or high-throughput
screening of combinatorial libraries for compounds that trigger or inhibit
protein-protein interactions, characterized by utilizing the method of claim
48 to identify new drug targets.

53. The method according to claim 52, wherein the
targets are biologically active proteins.

WO 98/34120    PCT/CA98/00068



54. The method of claim 53, wherein said biologically
active proteins are selected from the group consisting of receptors,
inhibitors, enzymes or ion channels.

55. A method for detecting biomolecular interactions,
said method comprising:

(a) selecting an appropriate reporter molecule;

(b) effecting fragmentation of said reporter molecule;

(c) fusing or attaching fragments of said reporter
molecule separately to other molecules; followed by

(d) reassociation of said reporter fragments through
interactions of the molecules that are fused to said fragments.

56. The composition of Claim 55 wherein the first
molecule is selected from the group consisting of: multimeric protein,
multimeric receptor, multimeric binding protein, catalytic molecule, energy
transfer molecule, a fluorescent or luminescent or phosphorescent
protein.

57. The composition of Claim 55 wherein the first
molecule is a multimeric enzyme.

58. In the method of effecting gene therapy, the step
which comprises effecting the method of claim 2.



[Figure, sheet 1/7: schematic of the PCA strategy, shown for two cases.
Labels: Transfection/Transformation; Host Cell; SUBSTRATE, PRODUCT;
ASSAY: COLORIMETRIC, FLUOROMETRIC, SURVIVAL.]



[Figure, sheet 2/7: structural diagram. Recoverable labels: Zipper, Zipper.]



[Figure sheets 3/7 and 4/7: no text recoverable.]



[Figure, sheet 5/7: flow chart. Recoverable text follows.]

Creation of DHFR fragments:
(diagram labels: Stop; Unique restriction site; fragments 1,2 and 3)

Insertion of fragments into pQE-32 for bacterial screening,
or pMT3 or Zap Express for eukaryotic screening:
(diagram labels: 6His; 3)

Note: for eukaryotic expression,
the expressed constructs have
no hexahistidine tag.

Insertion of targets or unknowns (including directional
insertion of cDNA library):
(diagram labels: 6His; 3; unknown)

Survival assay:
Transform or transfect, or cotransform or cotransfect,
with selective pressure

Identification of clones (for library screening only):

-Propagation of cells (bacterial or eukaryotic)
-Isolation of plasmid DNA and insert sequencing
-For bacterial screening only: overexpression and one-step
purification of fusion products by the hexahistidine tag



[Figure sheet 6/7: no text recoverable.]



[Figure, sheet 7/7: diagram of paired chains with COOH termini and
numbered positions (1, 6, 8 to 14). Legend: ATTRACTIVE PAIRING
(charge : charge; charge : neutral polar; neutral polar : neutral polar);
REPULSIVE PAIRING (charge : charge).]



INTERNATIONAL SEARCH REPORT

International Application No: PCT/CA 98/00068

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 G01N33/68 G01N33/58 [third symbol illegible]

According to International Patent Classification (IPC) or to both national
classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by
classification symbols)
IPC 6 G01N

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: X,P
Citation: J.N. PELLETIER ET AL.: "A protein complementation assay for
detection of protein-protein interactions in vivo", PROTEIN ENGINEERING,
vol. 10, no. sup. 1 October 1997, OXFORD UK, page 89, XP002064563;
see the whole document
Relevant to claim No.: 1-58

Category: [not legible]
Citation: US 5 362 625 A (M. KREVON ET AL.), 8 November 1994;
cited in the application; see the whole document
Relevant to claim No.: 2-58

Further documents are listed in the continuation of box C.
Patent family members are listed in annex.

Special categories of cited documents:

"A" document defining the general state of the art which is not
considered to be of particular relevance
"E" earlier document but published on or after the international
filing date
"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or
other means
"P" document published prior to the international filing date but
later than the priority date claimed
"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention
"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art
"&" document member of the same patent family

Date of the actual completion of the international search: 11 May 1998
Date of mailing of the international search report: 26/05/1998

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer: Van Bohemen, C




C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Categories, in the order listed: Y; X,P; A,P
Relevant to claim No., in the order listed: 2-58; 1; 2-58

Citations:

N. JOHNSSON ET AL.: "Split ubiquitin as sensor of protein
interactions in vivo", PROCEEDINGS OF THE NATIONAL ACADEMY OF
SCIENCES USA, vol. 91, 1 October 1994, WASHINGTON DC USA,
pages 10340-10344, XP002064564; cited in the application;
see the whole document

F. ROSSI ET AL.: "Monitoring protein-protein interactions in intact
eukaryotic cells by beta-galactosidase complementation",
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES USA,
vol. 94, 1 October 1997, WASHINGTON DC USA,
pages 8405-8410, XP002064565; cited in the application;
see the whole document



Information on patent family members

Patent document cited in search report: US 5362625
Publication date: 08-11-1994

Patent family member(s)    Publication date
AT 162891 T                15-02-1998
AU 657532 B                16-03-1995
AU 1629192 A               11-03-1993
CA 2068190 A,C             16-11-1992
DE 69224231 D              05-03-1998
EP 0514173 A               19-11-1992
JP 5276959 A               26-10-1993


